Model-Based Free-Viewpoint Video Acquisition, Rendering and Encoding

In recent years, the convergence of computer vision and computer graphics has established free-viewpoint video as a new field of research. The goal is to advance traditional 2D video into an immersive medium that lets the viewer interactively choose an arbitrary viewpoint in 3D space onto a scene while it plays back. In this paper we give an overview of a system for reconstructing, rendering and encoding free-viewpoint videos of human actors. It employs a hardware-accelerated, marker-free optical motion capture algorithm that operates on multi-view video streams and uses an a-priori body model to reconstruct the shape and motion of a moving actor. Real-time, high-quality rendering of the moving person from arbitrary perspectives is achieved by texturing the model from the multi-view video frames. We also present a predictive encoding scheme and a 4D-SPIHT wavelet compression scheme, both of which exploit the 3D scene geometry to encode the multi-view texture images efficiently.
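To make the multi-view texturing step more concrete, the following is a minimal sketch of view-dependent blending for a single surface point. The abstract does not specify how the camera images are weighted, so the cosine-based weighting, the `sharpness` exponent, and the function names `blend_weights` and `blend_colors` are illustrative assumptions, not the system's actual method. Directions are taken to point from the surface point toward the virtual viewer and toward each camera.

```python
import math

def _normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def blend_weights(view_dir, cam_dirs, visible, sharpness=2.0):
    """Per-camera blending weights for one surface point (hypothetical scheme).

    Cameras whose direction is closer to the virtual view direction get
    larger weights; cameras that do not see the point get weight zero.
    """
    v = _normalize(view_dir)
    raw = []
    for d, vis in zip(cam_dirs, visible):
        # Cosine of the angle between camera direction and view direction,
        # clamped so that cameras facing away contribute nothing.
        cos = max(0.0, sum(a * b for a, b in zip(_normalize(d), v)))
        raw.append(cos ** sharpness if vis else 0.0)
    total = sum(raw)
    # Normalize so the weights sum to one (unless no camera contributes).
    return [w / total for w in raw] if total > 0 else raw

def blend_colors(colors, weights):
    """Weighted average of the per-camera colors sampled at the point."""
    channels = len(colors[0])
    return [sum(w * c[i] for w, c in zip(weights, colors)) for i in range(channels)]

# Example: three cameras, the third one is occluded.
weights = blend_weights((0, 0, 1), [(0, 0, 1), (1, 0, 0), (0, 0, 2)], [True, True, False])
color = blend_colors([(255, 0, 0), (0, 255, 0), (0, 0, 255)], weights)
```

In a real renderer this computation would typically run per fragment on graphics hardware, with visibility determined from the reconstructed body model. The sketch only illustrates the weighting idea on the CPU.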
