An unsupervised video foreground co-localization and segmentation process by incorporating motion cues and frame features

Video foreground segmentation is one of the key problems in video processing. In this paper, we proposed a novel and fully unsupervised approach for foreground object co-localization and segmentation of unconstrained videos. We firstly compute both the actual edges and motion boundaries of the video frames, and then align them by their HOG feature maps. Then, by filling the occlusions generated by the aligned edges, we obtained more precise masks about the foreground object. Such motion-based masks could be derived as the motion-based likelihood. Moreover, the color-base likelihood is adopted for the segmentation process. Experimental Results show that our approach outperforms most of the State-of-the-art algorithms.

[1]  Olga Veksler,et al.  Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[2]  Ce Liu,et al.  Exploring new representations and applications for motion analysis , 2009 .

[3]  Cordelia Schmid,et al.  Learning object class detectors from weakly annotated video , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  C. Lawrence Zitnick,et al.  Fast Edge Detection Using Structured Forests , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[6]  Xiaodong Gu,et al.  An Information Theoretic Model of Spatiotemporal Visual Saliency , 2007, 2007 IEEE International Conference on Multimedia and Expo.

[7]  C. Lawrence Zitnick,et al.  Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.

[8]  Jean Ponce,et al.  Unsupervised Object Discovery and Tracking in Video Collections , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[9]  Gary Bradski,et al.  Computer Vision Face Tracking For Use in a Perceptual User Interface , 1998 .

[10]  Vittorio Ferrari,et al.  Fast Object Segmentation in Unconstrained Video , 2013, 2013 IEEE International Conference on Computer Vision.

[11]  Michael F. Cohen,et al.  Optimized Color Sampling for Robust Matting , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Liming Zhang,et al.  Spatio-temporal Saliency detection using phase spectrum of quaternion fourier transform , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Pong C. Yuen,et al.  Object motion detection using information theoretic spatio-temporal saliency , 2009, Pattern Recognit..

[14]  Cordelia Schmid,et al.  Learning Semantic Segmentation with Weakly-Annotated Videos , 2016 .

[15]  Yang Wang,et al.  Efficient Object Localization and Segmentation in Weakly Labeled Videos , 2014, ISVC.

[16]  Yong Jae Lee,et al.  Key-segments for video object segmentation , 2011, 2011 International Conference on Computer Vision.

[17]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[18]  Fei-Fei Li,et al.  Efficient Image and Video Co-localization with Frank-Wolfe Algorithm , 2014, ECCV.

[19]  Marius Leordeanu,et al.  Multiple Frames Matching for Object Discovery in Video , 2015, BMVC.

[20]  Chang-Su Kim,et al.  Primary Object Segmentation in Videos via Alternate Convex Optimization of Foreground and Background Distributions , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Chang-Su Kim,et al.  POD: Discovering Primary Objects in Videos Based on Evolutionary Refinement of Object Recurrence, Background, and Primary Object Models , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Jitendra Malik,et al.  Object Segmentation by Long Term Analysis of Point Trajectories , 2010, ECCV.

[23]  Cordelia Schmid,et al.  Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.

[24]  Thomas Deselaers,et al.  Measuring the Objectness of Image Windows , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Cordelia Schmid,et al.  Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Cordelia Schmid,et al.  Spatio-temporal Object Detection Proposals , 2014, ECCV.

[27]  James M. Rehg,et al.  Video Segmentation by Tracking Many Figure-Ground Segments , 2013, 2013 IEEE International Conference on Computer Vision.

[28]  Vladimir Kolmogorov,et al.  An experimental comparison of min-cut/max- flow algorithms for energy minimization in vision , 2001, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Mei Han,et al.  Efficient hierarchical graph-based video segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30]  Andrew Blake,et al.  "GrabCut" , 2004, ACM Trans. Graph..