Key-segments for video object segmentation

We present an approach to discover and segment foreground object(s) in video. Given an unannotated video sequence, the method first identifies object-like regions in any frame according to both static and dynamic cues. We then compute a series of binary partitions among those candidate “key-segments” to discover hypothesis groups with persistent appearance and motion. Finally, using each ranked hypothesis in turn, we estimate a pixel-level object labeling across all frames, where (a) the foreground likelihood depends on both the hypothesis's appearance as well as a novel localization prior based on partial shape matching, and (b) the background likelihood depends on cues pulled from the key-segments' (possibly diverse) surroundings observed across the sequence. Compared to existing methods, our approach automatically focuses on the persistent foreground regions of interest while resisting oversegmentation. We apply our method to challenging benchmark videos, and show competitive or better results than the state-of-the-art.

[1]  Jitendra Malik,et al.  Motion segmentation and tracking using normalized cuts , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[2]  James M. Rehg,et al.  Motion Coherent Tracking with Multi-label MRF optimization , 2010, BMVC.

[3]  Nuno Vasconcelos,et al.  The discriminant center-surround hypothesis for bottom-up saliency , 2007, NIPS.

[4]  Pierre Baldi,et al.  A principled approach to detecting surprising events in video , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Mei Han,et al.  Efficient hierarchical graph-based video segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  R. Stephenson A and V , 1962, The British journal of ophthalmology.

[7]  Ce Liu,et al.  Exploring new representations and applications for motion analysis , 2009 .

[8]  Antonio Torralba,et al.  Unsupervised Detection of Regions of Interest Using Iterative Link Analysis , 2009, NIPS.

[9]  Jitendra Malik,et al.  Object Segmentation by Long Term Analysis of Point Trajectories , 2010, ECCV.

[10]  Pietro Perona,et al.  A Factorization Approach to Grouping , 1998, ECCV.

[11]  Antonio Torralba,et al.  LabelMe video: Building a video database with human annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[12]  Luc Van Gool,et al.  Video mining with frequent itemset configurations , 2006 .

[13]  Andrew Blake,et al.  Cosegmentation of Image Pairs by Histogram Matching - Incorporating a Global Constraint into MRFs , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[14]  Olga Veksler,et al.  Fast Approximate Energy Minimization via Graph Cuts , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Edwin Olson,et al.  Single-Cluster Spectral Graph Partitioning for Robotics Applications , 2005, Robotics: Science and Systems.

[16]  Cristian Sminchisescu,et al.  Constrained parametric min-cuts for automatic object segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Tsuhan Chen,et al.  DISCOV: A Framework for Discovering Objects in Video , 2008, IEEE Transactions on Multimedia.

[18]  Pieter Peers,et al.  SubEdit: a representation for editing measured heterogeneous subsurface scattering , 2009, SIGGRAPH 2009.

[19]  W. Eric L. Grimson,et al.  Adaptive background mixture models for real-time tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[20]  Derek Hoiem,et al.  Category Independent Object Proposals , 2010, ECCV.

[21]  Scott Cohen,et al.  LIVEcut: Learning-based interactive video segmentation by evaluation of multiple propagated cues , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22]  Jitendra Malik,et al.  Tracking as Repeated Figure/Ground Segmentation , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Dimitris N. Metaxas,et al.  ]Video object segmentation by hypergraph cut , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Nanning Zheng,et al.  Learning to Detect a Salient Object , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Robert T. Collins,et al.  Shape constrained figure-ground segmentation and tracking , 2009, CVPR.

[26]  Eric L. Miller,et al.  Multiple Hypothesis Video Segmentation from Superpixel Flows , 2010, ECCV.

[27]  W. Marsden I and J , 2012 .

[28]  Andrew Zisserman,et al.  OBJ CUT , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[29]  Yong Jae Lee,et al.  Collect-cut: Segmentation with top-down cues discovered in multi-object images , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30]  Stanley T. Birchfield,et al.  Adaptive fragments-based tracking of non-rigid objects using level sets , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[31]  Thomas Deselaers,et al.  What is an object? , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32]  William Brendel,et al.  Video object segmentation by tracking regions , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[33]  Guillermo Sapiro,et al.  Video SnapCut: robust video object cutout using localized classifiers , 2009, ACM Trans. Graph..