Paper Information - On Dependent Bit Allocation for Multiview Image Coding With Depth-Image-Based Rendering

The encoding of both texture and depth maps of multiview images, captured by a set of spatially correlated cameras, is important for any 3-D visual communication system based on depth-image-based rendering (DIBR). In this paper, we address the problem of efficient bit allocation among texture and depth maps of multiview images. More specifically, suppose we are given a coding tool to encode texture and depth maps at the encoder and a view-synthesis tool to construct intermediate views at the decoder using neighboring encoded texture and depth maps. Our goal is to determine how best to select captured views for encoding and how to distribute the available bits among the texture and depth maps of the selected coded views, such that the visual distortion of the desired constructed views is minimized. First, in order to obtain at the encoder a low-complexity estimate of the visual quality of a large number of desired synthesized views, we derive a cubic distortion model based on basic DIBR properties, whose parameters are obtained using only a small number of viewpoint samples. Then, we demonstrate that the optimal selection of coded views and quantization levels for the corresponding texture and depth maps is equivalent to finding the shortest path in a specially constructed 3-D trellis. Finally, we show that, under assumptions of monotonicity in the predictor's quantization level and distance, suboptimal solutions can be efficiently pruned from the feasible space during the solution search. Experiments show that our proposed selection of coded views and quantization levels for the corresponding texture and depth maps outperforms an alternative scheme using constant quantization levels for all maps (commonly used in video standard implementations) by up to 1.5 dB. Moreover, the complexity of our scheme can be reduced by at least 80% relative to the full solution search.
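The shortest-path formulation in the abstract can be illustrated with a much smaller toy problem. The sketch below is not the paper's 3-D trellis: it assumes a single chain of coded views, one quantization level per view, and a hypothetical Lagrangian cost `toy_cost` in which a coarsely quantized predictor inflates the distortion of the view it predicts. The names `trellis_allocate`, `brute_force`, `toy_cost`, and `LEVELS` are illustrative only and do not come from the paper. The monotonicity-based pruning mentioned in the abstract is not implemented here.

```python
from itertools import product


def trellis_allocate(num_views, levels, cost):
    """Viterbi-style shortest path: one stage per coded view, one state per level.

    cost(i, q_prev, q) returns the Lagrangian cost D + lambda * R of coding view i
    at level q when its predictor (view i - 1) was coded at level q_prev
    (q_prev is None for the first view).
    """
    # best[q] = (cheapest total cost of any path ending in state q, that path)
    best = {q: (cost(0, None, q), [q]) for q in levels}
    for i in range(1, num_views):
        nxt = {}
        for q in levels:
            # Keep only the cheapest way of arriving at state q in stage i.
            total, path = min(
                ((best[p][0] + cost(i, p, q), best[p][1]) for p in levels),
                key=lambda t: t[0],
            )
            nxt[q] = (total, path + [q])
        best = nxt
    return min(best.values(), key=lambda t: t[0])


def brute_force(num_views, levels, cost):
    """Exhaustive search over all level assignments, used only as a cross-check."""
    def total(path):
        return sum(cost(i, path[i - 1] if i else None, q) for i, q in enumerate(path))

    winner = min(product(levels, repeat=num_views), key=total)
    return total(winner), list(winner)


def toy_cost(i, q_prev, q, lam=0.5):
    """Hypothetical cost model (an assumption, not taken from the paper)."""
    rate = 40.0 / q  # coarser level -> fewer bits
    penalty = 1.0 if q_prev is None else 1.0 + 0.1 * q_prev  # predictor quality
    distortion = (q * q / 4.0) * penalty
    return distortion + lam * rate


LEVELS = [1, 2, 4, 8]
best_cost, best_path = trellis_allocate(4, LEVELS, toy_cost)
print(best_cost, best_path)
```

Because each stage's cost depends only on the previous stage's level, the trellis search evaluates on the order of `num_views * len(levels) ** 2` transitions, whereas the exhaustive search evaluates `len(levels) ** num_views` assignments. Both return the same minimum cost on this toy model.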
