From Video Matching to Video Grounding

This paper addresses the background estimation problem for videos captured by moving cameras, referred to as video grounding. The goal is to reconstruct a video as it would appear without foreground objects, e.g. cars or people. What differentiates video grounding from known background estimation methods is that the camera follows unconstrained motion, so the background changes continuously. We build on video matching, since multiple videos contribute to the reconstruction. Without loss of generality, we investigate a challenging case where videos are recorded by in-vehicle cameras that follow the same road. Beyond video synchronization and spatiotemporal alignment, we focus on background reconstruction by exploiting inter- and intra-sequence similarities. In this context, we propose a Markov random field formulation that integrates the temporal coherence of videos while exploiting the decisions of a support vector machine classifier about the backgroundness of regions in video frames. Experiments with real sequences recorded by moving vehicles verify the potential of the video grounding algorithm against state-of-the-art baselines.
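The interplay between classifier decisions and temporal coherence can be illustrated with a deliberately simplified sketch. It is not the paper's formulation: it assumes a single region tracked over time, a 1-D chain instead of a full random field, a unary cost of one minus a hypothetical backgroundness score per source video, and a constant penalty (`switch_cost`, an assumed parameter) for changing the source video between consecutive frames. The function name `label_chain` is likewise made up for illustration.

```python
from typing import List, Sequence


def label_chain(unary: Sequence[Sequence[float]], switch_cost: float) -> List[int]:
    """Pick one source label per frame on a chain.

    Minimises sum_t unary[t][x_t] + switch_cost * [x_t != x_(t-1)]
    exactly with dynamic programming (Viterbi). unary[t][k] is the
    assumed cost of taking the region at frame t from source video k.
    """
    n_labels = len(unary[0])
    cost = list(unary[0])  # best cost of ending at each label so far
    back = []              # back[t-1][k]: best predecessor of label k at frame t

    for t in range(1, len(unary)):
        new_cost, pointers = [], []
        for k in range(n_labels):
            # Cheapest previous label, paying switch_cost if it differs from k.
            best_j = min(
                range(n_labels),
                key=lambda j: cost[j] + (0.0 if j == k else switch_cost),
            )
            transition = 0.0 if best_j == k else switch_cost
            pointers.append(best_j)
            new_cost.append(cost[best_j] + transition + unary[t][k])
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final label.
    label = min(range(n_labels), key=lambda k: cost[k])
    labels = [label]
    for pointers in reversed(back):
        label = pointers[label]
        labels.append(label)
    labels.reverse()
    return labels


# Hypothetical backgroundness scores in [0, 1] for 4 frames and 2 source videos.
scores = [[0.9, 0.2], [0.4, 0.6], [0.8, 0.3], [0.9, 0.1]]
unary = [[1.0 - s for s in frame] for frame in scores]

independent = label_chain(unary, switch_cost=0.0)  # per-frame decisions only
coherent = label_chain(unary, switch_cost=0.5)     # temporal coherence enforced
```

With no switching penalty each frame simply takes the source with the highest score, so the selection flickers between videos; with the penalty, the weak preference at the second frame is overruled and a single source is kept throughout.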
