Video Segmentation with Superpixels

Video segmentation is an important problem and has recently regained interest. However, there is no common agreement about the necessary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. Our results show that a frame-based superpixel segmentation combined with a few motion- and appearance-based affinities is sufficient to obtain good video segmentation performance. A second contribution of the paper is the extension of [1] to include motion cues, which makes the algorithm globally aware of motion and thus improves its performance on video sequences. Finally, we contribute an extension of an established image segmentation benchmark [1] to videos, allowing coarse-to-fine video segmentations and multiple human annotations. We evaluate our results on BMDS [2] and compare them to existing methods.
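To make the idea of within- and between-frame affinities concrete, here is a minimal sketch of a generic spectral two-way partition over a superpixel graph. It is not the paper's algorithm. The feature values, the Gaussian affinity, the bandwidths `sigma_app` and `sigma_mot`, and the helper names `build_affinity` and `two_way_cut` are all assumptions chosen for illustration.

```python
import numpy as np

def build_affinity(features, sigma_app=0.5, sigma_mot=0.5):
    """Gaussian affinity over all superpixel pairs (hypothetical form).

    features[:, 0] is a mean appearance value, features[:, 1] a mean
    motion magnitude. Pairs may lie in the same frame or in different
    frames; both are treated alike in this toy example.
    """
    d_app = (features[:, None, 0] - features[None, :, 0]) ** 2
    d_mot = (features[:, None, 1] - features[None, :, 1]) ** 2
    return np.exp(-d_app / sigma_app**2 - d_mot / sigma_mot**2)

def two_way_cut(W):
    """Split the graph in two using the second eigenvector of the
    symmetric normalized Laplacian (normalized-cut-style relaxation)."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]       # generalized eigenvector
    return (y > 0).astype(int)

# Toy data: superpixels 0-3 are in frame 0, superpixels 4-7 in frame 1.
# Columns: (appearance, motion). Two regions with distinct values.
features = np.array([
    [0.10, 0.0], [0.15, 0.0], [0.85, 1.0], [0.90, 1.0],   # frame 0
    [0.12, 0.0], [0.14, 0.0], [0.88, 1.0], [0.86, 1.0],   # frame 1
])
W = build_affinity(features)
labels = two_way_cut(W)
print(labels)
```

In this toy setup the superpixels of each region receive the same label in both frames, because the between-frame affinities link them as strongly as the within-frame ones. Which of the two regions gets label 0 is arbitrary, since the sign of an eigenvector is not determined.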

[1] Jitendra Malik, et al. Normalized cuts and image segmentation, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2] Michael I. Jordan, et al. On Spectral Clustering: Analysis and an algorithm, 2001, NIPS.

[3] Daniel DeMenthon, et al. Spatio-Temporal Segmentation of Video by Hierarchical Mean Shift Analysis, 2002.

[4] Hayit Greenspan, et al. A Probabilistic Framework for Spatio-Temporal Video Representation & Indexing, 2002, ECCV.

[5] Mads Nielsen, et al. Computer Vision — ECCV 2002, 2002, Lecture Notes in Computer Science.

[6] Dorin Comaniciu, et al. Mean Shift: A Robust Approach Toward Feature Space Analysis, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[7] Tony F. Chan, et al. A Multiphase Level Set Framework for Image Segmentation Using the Mumford and Shah Model, 2002, International Journal of Computer Vision.

[8] Andrew Zisserman, et al. Learning Layered Motion Segmentations of Video, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[9] Brendan J. Frey, et al. Generative Model for Layers of Appearance and Deformation, 2005, AISTATS.

[10] Roberto Cipolla, et al. Unsupervised Bayesian Detection of Independent Motion in Crowds, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[11] Horst Bischof, et al. A Duality Based Approach for Realtime TV-L1 Optical Flow, 2007, DAGM-Symposium.

[12] Andrew J. Davison, et al. Active Matching, 2008, ECCV.

[13] Sylvain Paris, et al. Edge-Preserving Smoothing and Mean-Shift Segmentation of Video Streams, 2008, ECCV.

[14] Daniel Cremers, et al. An algorithm for minimizing the Mumford-Shah functional, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[15] Jitendra Malik, et al. From contours to regions: An empirical evaluation, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[16] Takahiro Okabe, et al. Using individuality to track individuals: Clustering individual trajectories in crowds using local appearance and frequency trait, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[17] William Brendel, et al. Video object segmentation by tracking regions, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[18] Cristian Sminchisescu, et al. Constrained parametric min-cuts for automatic object segmentation, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19] Radim Sára, et al. A Weak Structure Model for Regular Pattern Recognition Applied to Facade Images, 2010, ACCV.

[20] Jitendra Malik, et al. Object Segmentation by Long Term Analysis of Point Trajectories, 2010, ECCV.

[21] Sven J. Dickinson, et al. Spatiotemporal Closure, 2010, ACCV.

[22] Eric L. Miller, et al. Multiple Hypothesis Video Segmentation from Superpixel Flows, 2010, ECCV.

[23] Kurt Keutzer, et al. Dense Point Trajectories by GPU-Accelerated Large Displacement Optical Flow, 2010, ECCV.

[24] Thomas Deselaers, et al. ClassCut for Unsupervised Class Segmentation, 2010, ECCV.

[25] Mei Han, et al. Efficient hierarchical graph-based video segmentation, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[26] Derek Hoiem, et al. Category Independent Object Proposals, 2010, ECCV.

[27] Ivan Laptev, et al. Track to the future: Spatio-temporal video segmentation with long-range motion cues, 2011, CVPR 2011.

[28] Jitendra Malik, et al. Occlusion boundary detection and figure/ground assignment from optical flow, 2011, CVPR 2011.

[29] Roberto Cipolla, et al. Spatio-temporal clustering of probabilistic region trajectories, 2011, 2011 International Conference on Computer Vision.

[30] Thomas Brox, et al. Object segmentation in video: A hierarchical variational approach for turning point trajectories into dense regions, 2011, 2011 International Conference on Computer Vision.

[31] Yong Jae Lee, et al. Key-segments for video object segmentation, 2011, 2011 International Conference on Computer Vision.

[32] Narendra Ahuja, et al. Exploiting nonlocal spatiotemporal structure for video segmentation, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[33] Chenliang Xu, et al. Evaluation of super-voxel methods for early video processing, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[34] Tomokazu Yoshida, et al. A Study of an Automatic Field Map Creation Method Using Efficient Graph-Based Image Segmentation, 2014.