Efficient compression of arbitrary multi-view video signals

Multiple views of a scene, obtained from cameras positioned at distinct viewpoints, can provide a viewer with the benefits of added realism, selective viewing, and improved scene understanding. The importance of these signals is evidenced by the recently proposed Multi-View Profile (MVP) extension to the MPEG-2 video compression standard, and by their explicit incorporation into the future MPEG-4 standard. However, multi-view compression implementations typically rely on model assumptions made for single-view image sequences. We hypothesize (and demonstrate) that a substantial reduction in system bandwidth can be achieved by using displacement vector field and image intensity models tuned to the special characteristics of multi-view video signals.

This thesis focuses on the predictive coding of non-periodic, i.e., arbitrary, multi-view video signals for the applications of simulated motion parallax and viewer-specified degree of stereoscopy. To facilitate their practical use, we desire algorithms that are applicable to the common waveform-based, hybrid encoder framework, which consists of a frame-based prediction followed by residual encoding. Three novel techniques are developed, which respectively improve the processes of frame-based prediction, residual encoding, and viewpoint interpolation. These are:

• a simple method to adaptively select the best possible reference frame, based on the estimated percentage of occlusion between each candidate and the frame to be encoded;
• a low bit rate residual encoding technique that compensates for pixel intensity nonstationarities along a displacement trajectory and for the practical limitations of the prediction process; and
• an algorithm that correctly handles displacement estimation errors, occlusions, and ambiguously referenced image regions when interpolating subjectively pleasing “virtual” viewpoints from a noisy displacement vector field.

We demonstrate the superiority of each of these algorithms on numerous multi-view video signals through comparisons with conventional techniques, and we analyze their cost/benefit ratio by weighing the increases in system complexity and storage against the rate-distortion improvements. Finally, we indicate the relative significance of these algorithms, and provide insight into how and when they should be combined into a complete, efficient multi-view encoder/decoder system.
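
The first of the three techniques, reference frame selection by estimated occlusion percentage, can be illustrated with a minimal sketch. The abstract does not say how occlusion is estimated, so the sketch assumes that a boolean occlusion mask is already available for each candidate reference frame (True marks a pixel of the frame to be encoded that has no visible counterpart in that candidate). The names `occlusion_percentage` and `select_reference_frame` and the candidate labels are hypothetical, and the selection rule shown (pick the lowest percentage) is only one plausible reading of the criterion, not the thesis's actual procedure.

```python
from typing import Dict, List, Tuple

# Assumption: True marks a pixel estimated as occluded with respect to
# the candidate reference frame. How the mask is estimated is not shown.
Mask = List[List[bool]]


def occlusion_percentage(mask: Mask) -> float:
    """Return the percentage of pixels flagged as occluded in the mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("empty occlusion mask")
    occluded = sum(1 for row in mask for pixel in row if pixel)
    return 100.0 * occluded / total


def select_reference_frame(masks: Dict[str, Mask]) -> Tuple[str, float]:
    """Pick the candidate with the lowest estimated occlusion percentage.

    Returns the candidate's label and its occlusion percentage.
    """
    if not masks:
        raise ValueError("no candidate reference frames given")
    scores = {name: occlusion_percentage(m) for name, m in masks.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Two hypothetical candidates for a tiny 2x2 frame.
    candidates = {
        "adjacent_view": [[True, False], [False, False]],
        "previous_frame": [[True, True], [False, False]],
    }
    print(select_reference_frame(candidates))
```

In a hybrid encoder, the label returned here would determine which stored frame drives the frame-based prediction step before residual encoding.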

[97]  Nagaraj Nandhakumar,et al.  An Improved Power Cepstrum Based Stereo Correspondence Method for Textured Scenes , 1996, IEEE Trans. Pattern Anal. Mach. Intell..