The Multiple-Camera 3-D Production Studio

Multiple-camera systems are widely used in research and development as a means of capturing and synthesizing realistic 3-D video content. Studio systems for 3-D production of human performance are reviewed from the literature, and the practical experience gained in developing prototype studios across two research laboratories is reported. System design should consider the studio backdrop for foreground matting, lighting for ambient illumination, camera acquisition hardware, the camera configuration for scene capture, and accurate geometric and photometric camera calibration. A ground-truth evaluation is performed to quantify the effect of different constraints on the multiple-camera system in terms of geometric accuracy and the requirement for high-quality view synthesis. Changing camera height has only a limited influence on surface visibility. Multiple-camera sets or an active vision system may be required for wide-area capture. Accurate reconstruction requires a camera baseline of 25°, and the achievable accuracy is 5-10 mm at current camera resolutions. Accuracy is inherently limited, and view-dependent rendering is required for view synthesis with sub-pixel accuracy where display resolutions match camera resolutions. The two prototype studios are contrasted, and state-of-the-art techniques for 3-D content production are demonstrated.
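
The link between camera baseline, camera resolution, and reconstruction accuracy can be illustrated with a simple geometric sketch. This is not the evaluation procedure used in the paper. It assumes an ideal pinhole camera and two rays that meet at the baseline angle, each displaced sideways by one pixel footprint in opposite senses. The function names, the 4 m subject distance, and the 2000-pixel focal length are hypothetical values chosen only for illustration.

```python
import math


def pixel_footprint_mm(distance_mm: float, focal_length_px: float) -> float:
    """Width on the subject covered by one pixel under a pinhole model."""
    return distance_mm / focal_length_px


def depth_error_mm(ray_offset_mm: float, baseline_deg: float) -> float:
    """Shift of a triangulated point along the bisector of two rays.

    The rays meet at an angle of baseline_deg. Each ray is displaced
    perpendicular to itself by ray_offset_mm, in opposite senses, so the
    intersection slides along the bisector by offset / sin(angle / 2).
    """
    half_angle = math.radians(baseline_deg) / 2.0
    return ray_offset_mm / math.sin(half_angle)


if __name__ == "__main__":
    # Hypothetical setup: subject 4 m away, focal length of 2000 pixels.
    footprint = pixel_footprint_mm(4000.0, 2000.0)
    for angle in (10, 25, 45, 90):
        error = depth_error_mm(footprint, angle)
        print(f"baseline {angle:>2} deg: depth error about {error:.1f} mm")
```

Under these assumptions the depth error grows rapidly as the baseline narrows, which is consistent with the abstract's point that accuracy depends on both the camera baseline and the camera resolution. The model ignores calibration error, matching ambiguity, and occlusion, so its numbers should not be read as the paper's results.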
