Stereoscopic content production of complex dynamic scenes using a wide-baseline monoscopic camera set-up

Conventional stereoscopic video content production requires dedicated stereo camera rigs, which are costly and offer little flexibility for video editing. In this paper, we propose a novel approach that requires only a small number of standard cameras sparsely located around a scene and automatically converts the monocular inputs into stereoscopic streams. The approach combines a probabilistic spatio-temporal segmentation framework with a state-of-the-art multi-view graph-cut reconstruction algorithm, thus providing full control of the stereoscopic settings at render time. Results on studio sequences of complex human motion demonstrate the suitability of the method for high-quality stereoscopic content generation with minimal user interaction.
