Classifier based graph construction for video segmentation

Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, which achieve top performance on recent benchmarks, consist of three essential components: (1) powerful features that account for object appearance and motion similarities; (2) spatio-temporal neighborhoods of pixels or superpixels (the graph edges), modeled using a combination of those features; and (3) a formulation of video segmentation as a graph partitioning problem. While a wide variety of features have been explored and various graph partitioning algorithms have been proposed, there is surprisingly little research on how to construct a graph to obtain the best video segmentation performance. This is the focus of our paper. We propose to combine features by means of a classifier, use calibrated classifier outputs as edge weights, and define the graph topology by edge selection. By learning the graph (without changes to the graph partitioning method), we improve the results of the best-performing video segmentation algorithm by 6% on the challenging VSB100 benchmark, while reducing its runtime by 55%, as the learnt graph is much sparser.
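To make the three proposed steps concrete, the following is a minimal Python sketch, not the implementation used in the paper. It assumes a plain logistic regression as the classifier (its sigmoid output stands in for a calibrated probability), two hypothetical per-edge features (an appearance distance and a motion distance between superpixels), and a simple probability threshold as the edge-selection rule. All function names, feature names, and the threshold value are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_edge_classifier(X, y, lr=0.5, steps=2000):
    """Fit logistic regression by gradient descent (assumed classifier).

    X: (n_edges, n_features) per-edge feature distances.
    y: (n_edges,) 1 if both endpoints lie in the same ground-truth segment.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        residual = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ residual) / len(y)
        b -= lr * residual.mean()
    return w, b

def build_graph(edges, X, w, b, keep_threshold=0.5):
    """Weight each candidate edge by its predicted same-segment probability
    and keep only edges at or above keep_threshold (assumed selection rule)."""
    probs = sigmoid(X @ w + b)
    return [(i, j, float(p)) for (i, j), p in zip(edges, probs)
            if p >= keep_threshold]

# Toy data: same-segment pairs have small distances, other pairs large ones.
rng = np.random.default_rng(0)
n_edges = 200
same = rng.integers(0, 2, size=n_edges)
centers = np.where(same[:, None] == 1, 0.2, 0.8)
X = rng.normal(loc=centers, scale=0.1, size=(n_edges, 2))
edges = [(k, k + 1) for k in range(n_edges)]  # hypothetical superpixel pairs

w, b = train_edge_classifier(X, same.astype(float))
graph = build_graph(edges, X, w, b)
print(f"kept {len(graph)} of {len(edges)} candidate edges")
```

The resulting weighted edge list can be handed unchanged to any graph partitioning method. Dropping low-probability edges is what makes the graph sparser in this sketch.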
