Wide-baseline multi-view video segmentation for 3D reconstruction

Obtaining foreground silhouettes across multiple views is a fundamental step in 3D reconstruction. In this paper we present a novel video segmentation approach that obtains foreground silhouettes for scenes captured by a wide-baseline camera rig, given sparse manual interaction in a single view. The algorithm is based on trimap propagation, a framework used in video matting. Bayesian inference coupled with camera calibration information is used to propagate high-confidence trimap labels spatio-temporally across the multi-view video, yielding coarse silhouettes that are subsequently refined using a matting algorithm. Recent matting-based techniques for multi-view foreground segmentation are limited to narrow-baseline set-ups with low foreground variation. The proposed wide-baseline silhouette propagation is robust to inter-view changes in foreground appearance, to shadows, and to similarity between foreground and background appearance. The approach demonstrates good performance in silhouette estimation for baselines of up to 180 degrees (opposing views). The segmentation technique has been fully integrated into a multi-view reconstruction pipeline, and the results demonstrate its suitability for multi-view reconstruction with wide-baseline camera set-ups and natural backgrounds.
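
The idea of keeping only high-confidence trimap labels can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the appearance model, the priors, or the thresholds, so the per-channel Gaussian colour models, the `prior_fg` value, and the `hi`/`lo` thresholds below are all assumptions chosen for illustration. The sketch also omits the multi-view step, in which camera calibration would be used to transfer labels between views.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gaussian(samples, min_var=1e-4):
    """Fit independent per-channel Gaussians to a list of RGB tuples.

    Assumption: channels are treated as independent; min_var avoids
    a zero variance when all samples agree in a channel.
    """
    n = len(samples)
    means = [sum(s[c] for s in samples) / n for c in range(3)]
    variances = [
        max(sum((s[c] - means[c]) ** 2 for s in samples) / n, min_var)
        for c in range(3)
    ]
    return means, variances


def likelihood(colour, model):
    """p(colour | model) under the independent-channel assumption."""
    means, variances = model
    p = 1.0
    for c in range(3):
        p *= gaussian_pdf(colour[c], means[c], variances[c])
    return p


def foreground_posterior(colour, fg_model, bg_model, prior_fg=0.5):
    """Bayes' rule: P(F | colour) from the two colour likelihoods."""
    num = likelihood(colour, fg_model) * prior_fg
    den = num + likelihood(colour, bg_model) * (1.0 - prior_fg)
    # If both likelihoods underflow to zero, the pixel is uninformative.
    return num / den if den > 0.0 else 0.5


def trimap_label(colour, fg_model, bg_model, prior_fg=0.5, hi=0.95, lo=0.05):
    """Return 'F', 'B' or 'U' (unknown); only confident pixels get F or B.

    The thresholds hi and lo are hypothetical values, not from the paper.
    """
    p = foreground_posterior(colour, fg_model, bg_model, prior_fg)
    if p >= hi:
        return "F"
    if p <= lo:
        return "B"
    return "U"
```

In a pipeline of the kind described above, pixels labelled `U` would be left for the matting stage to resolve, while `F` and `B` pixels would act as constraints for that stage.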
