Fast Appearance Modeling for Automatic Primary Video Object Segmentation

Automatic segmentation of the primary object in a video clip is challenging because no prior knowledge of the primary object is available. Most existing techniques therefore adopt an iterative approach to foreground and background appearance modeling: they fix the appearance model while optimizing the segmentation, then fix the segmentation while optimizing the appearance model. However, these approaches may depend on a good initialization and can easily become trapped in local optima. They are also usually time-consuming when applied to videos. To address these limitations, we propose a novel and efficient appearance modeling technique for automatic primary video object segmentation in the Markov random field (MRF) framework. It embeds the appearance constraint as auxiliary nodes and edges in the MRF structure, and optimizes both the segmentation and the appearance model parameters simultaneously in a single graph cut. Extensive experimental evaluations validate the superiority of the proposed approach over state-of-the-art methods in both efficiency and effectiveness.
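The idea of folding an appearance constraint into the graph as auxiliary nodes can be illustrated with a toy example. The sketch below is not the paper's actual construction; it is a minimal illustration under stated assumptions: regions carry a hypothetical saliency prior in [0, 1], each region is assigned to a hypothetical appearance bin (for example a quantized color), and every bin gets one auxiliary node linked to its member regions with weight `beta`. A cut then pays `beta` for every region that disagrees with the majority label of its bin, so appearance consistency and segmentation are resolved in one min-cut. The names `segment`, `lam`, and `beta` and the plain Edmonds-Karp solver are choices made for this illustration only.

```python
from collections import defaultdict, deque


def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the nodes on the source side of a minimum cut."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0.0  # make sure the reverse arc exists
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)  # nodes still reachable from the source
        # Find the bottleneck capacity along the path, then push that flow.
        flow = float("inf")
        v = sink
        while parent[v] is not None:
            flow = min(flow, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= flow
            residual[v][u] += flow
            v = u


def segment(saliency, bins, neighbors, lam=0.2, beta=0.3):
    """Label regions as foreground (True) or background (False) with one min-cut.

    saliency  -- assumed per-region foreground prior in [0, 1]
    bins      -- assumed appearance bin index of each region
    neighbors -- pairs (i, j) of adjacent regions
    lam, beta -- illustrative weights for smoothness and appearance consistency
    """
    cap = defaultdict(lambda: defaultdict(float))

    def add_undirected(u, v, w):
        cap[u][v] += w
        cap[v][u] += w

    for i, s in enumerate(saliency):
        cap["SRC"][("p", i)] += s          # paid if region i ends up background
        cap[("p", i)]["SNK"] += 1.0 - s    # paid if region i ends up foreground
        # Auxiliary appearance node: one per bin, tied to every member region.
        add_undirected(("p", i), ("bin", bins[i]), beta)
    for i, j in neighbors:
        add_undirected(("p", i), ("p", j), lam)  # pairwise smoothness term

    foreground = min_cut_source_side(cap, "SRC", "SNK")
    return [("p", i) in foreground for i in range(len(saliency))]


if __name__ == "__main__":
    saliency = [0.9, 0.8, 0.45, 0.2, 0.1, 0.1]
    bins = [0, 0, 0, 1, 1, 1]
    chain = [(i, i + 1) for i in range(5)]
    print(segment(saliency, bins, chain, beta=0.3))
    print(segment(saliency, bins, chain, beta=0.0))
```

In this toy chain, the third region has a weak prior (0.45) but shares an appearance bin with two strongly salient regions. With `beta=0.3` the auxiliary node pulls it into the foreground; with `beta=0.0` the appearance term vanishes and the region falls back to the background. No alternation between model fitting and labeling is needed, which is the property the abstract attributes to the single-graph-cut formulation.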
