Selective Video Object Cutout

Conventional video segmentation approaches rely heavily on appearance models. Such methods often use appearance descriptors that have limited discriminative power in complex scenarios. To improve segmentation performance, this paper presents a pyramid histogram-based confidence map that incorporates structure information into appearance statistics. The method also incorporates geodesic distance-based dynamic models. It then employs an efficient measure of uncertainty propagation using local classifiers to determine the image regions where the object labels might be ambiguous. The final foreground cutout is obtained by refining the uncertain regions. Additionally, to reduce manual labeling, our method determines in a principled manner which frames the human operator should label, which further boosts segmentation performance and minimizes labeling effort. Extensive experiments on two large benchmarks demonstrate that, compared with the state of the art, our solution achieves superior performance, favorable computational efficiency, and reduced manual labeling.
