Wavelet-Based Joint Estimation and Encoding of Depth-Image-Based Representations for Free-Viewpoint Rendering

We propose a wavelet-based codec for the static depth-image-based representation, which allows viewers to freely choose the viewpoint. The proposed codec jointly estimates and encodes the unknown depth map from multiple views using a novel rate-distortion (RD) optimization scheme. The rate constraint reduces the ambiguity of depth estimation by favoring piecewise-smooth depth maps. The optimization is solved efficiently by a novel dynamic programming algorithm along trees of integer wavelet coefficients. The codec encodes the image and the depth map jointly to decrease their redundancy and to provide an RD-optimized bitrate allocation between the two. The codec also offers scalability in both resolution and quality. Experiments on real data show the effectiveness of the proposed codec.
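
To illustrate how dynamic programming along a tree of integer wavelet coefficients can minimize a Lagrangian cost J = D + λR, the following is a minimal sketch of generic RD-optimal tree pruning. It is not the codec's actual algorithm, which also estimates depth from multiple views. The `Node` class, the toy rate model in `rate_bits`, and the constant `ZEROTREE_BITS` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One integer wavelet coefficient and its descendants (hypothetical layout)."""
    coeff: int
    children: List["Node"] = field(default_factory=list)


ZEROTREE_BITS = 1  # assumed cost of signalling "this whole subtree is zero"


def rate_bits(coeff: int) -> int:
    """Toy rate model (assumption): 1 flag bit, plus sign and magnitude bits if nonzero."""
    return 1 + (0 if coeff == 0 else 1 + abs(coeff).bit_length())


def prune(node: Node, lam: float) -> Tuple[float, int, Optional[Node]]:
    """Bottom-up dynamic programming over the coefficient tree.

    Returns (best Lagrangian cost, subtree energy, kept subtree or None).
    Each subtree is solved once; the parent only compares two options.
    """
    solved = [prune(child, lam) for child in node.children]
    energy = node.coeff ** 2 + sum(e for _, e, _ in solved)

    # Option A: zero the whole subtree. Distortion is its energy, rate is one symbol.
    cost_zero = energy + lam * ZEROTREE_BITS
    # Option B: code this coefficient exactly and reuse the children's optimal costs.
    cost_keep = lam * rate_bits(node.coeff) + sum(c for c, _, _ in solved)

    if cost_zero <= cost_keep:
        return cost_zero, energy, None
    kept = Node(node.coeff, [t for _, _, t in solved if t is not None])
    return cost_keep, energy, kept


if __name__ == "__main__":
    tree = Node(10, [Node(1), Node(0)])
    for lam in (0.0, 1.0, 1e6):
        cost, _, kept = prune(tree, lam)
        print(lam, cost, kept)
```

Under these assumptions, a larger λ prunes more of the tree, which is the same mechanism by which a rate constraint favors smoother, cheaper-to-code solutions.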
