Paper information - Very low bit rate video coding using 3-D models

Very low bit rate video coding using 3-D models

Acknowledgments: I would like to thank my supervisor, Prof. Dr.-Ing. Bernd Girod, who gave me the opportunity to join his group and to work in a stimulating and supportive atmosphere. I would also like to thank Prof. Dr. Günther Greiner for his interest in my work and for reviewing my thesis. During my doctoral studies, I had the pleasure of interacting with a large number of excellent researchers and skilled engineers. I am very thankful to my colleagues and friends and Thomas Wiegand for the many stimulating discussions, joint work, and proofreading. Furthermore, I would like to thank the many others around the Telecommunications Laboratory in Erlangen who made my stay here so enjoyable. I am especially grateful to Ursula Arnold for her invaluable administrative support. Last but not least, I would like to thank my friends and my family for their valuable suggestions and support.

Peter Eisert

[1] Yasuhiko Watanabe,et al. A trigonal prism-based method for hair image generation , 1992, IEEE Computer Graphics and Applications.

[2] Masahide Kaneko,et al. Coding of facial image sequence based on a 3-D model of the head and motion detection , 1991, J. Vis. Commun. Image Represent..

[3] Rama Chellappa,et al. Human and machine recognition of faces: a survey , 1995, Proc. IEEE.

[4] J. Ahlberg. Extraction and Coding of Face Model Parameters , 1999 .

[5] David R. Forsey,et al. Hierarchical B-spline refinement , 1988, SIGGRAPH.

[6] Gérard G. Medioni,et al. Detection of Intensity Changes with Subpixel Accuracy Using Laplacian-Gaussian Masks , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Yoshio Nagashima,et al. Image analysis for face modeling and facial image reconstruction , 1990, Other Conferences.

[8] Ajit Singh,et al. An estimation-theoretic framework for image-flow computation , 1990, [1990] Proceedings Third International Conference on Computer Vision.

[9] C. R. Moloney. Methods for illumination-independent processing of digital images , 1991, [1991] IEEE Pacific Rim Conference on Communications, Computers and Signal Processing Conference Proceedings.

[10] Bernd Girod,et al. Rate-constrained motion estimation , 1994, Other Conferences.

[11] Yair Shoham,et al. Efficient bit allocation for an arbitrary set of quantizers [speech coding] , 1988, IEEE Trans. Acoust. Speech Signal Process..

[12] Marian Stewart Bartlett,et al. Classifying Facial Action , 1995, NIPS.

[13] A. Yuille. Deformable Templates for Face Recognition , 1991, Journal of Cognitive Neuroscience.

[14] Roger Y. Tsai,et al. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses , 1987, IEEE J. Robotics Autom..

[15] Takeo Kanade,et al. Neural Network-Based Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[16] Hiroshi Masuda,et al. Data embedding algorithms for geometrical and non-geometrical targets in three-dimensional polygonal models , 1998, Comput. Commun..

[17] Werner Blohm,et al. Lightness determination at curved surfaces with applications to dynamic range compression and model-based coding of facial images , 1997, IEEE Trans. Image Process..

[18] Tony F. Chan,et al. An Improved Algorithm for Computing the Singular Value Decomposition , 1982, TOMS.

[19] Haibo Li,et al. Two-view facial movement estimation , 1994, IEEE Trans. Circuits Syst. Video Technol..

[20] Marc Levoy,et al. Light field rendering , 1996, SIGGRAPH.

[21] Thomas Wiegand,et al. Long-term memory motion-compensated prediction , 1999, IEEE Trans. Circuits Syst. Video Technol..

[22] Chil-Woo Lee,et al. Detection and Pose Estimation of Human Face with Multiple Model Images (Special Issue on Computer Vision) , 1994 .

[23] H C Longuet-Higgins,et al. The visual ambiguity of a moving plane , 1984, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[24] S. Negahdaripour,et al. Relaxing the Brightness Constancy Assumption in Computing Optical Flow , 1987 .

[25] Jechang Jeong,et al. Three-dimensional mesh warping for natural eye-to-eye contact in Internet video communication , 2000, Visual Communications and Image Processing.

[26] Peter Eisert,et al. Speech Driven Synthesis of Talking Head Sequences , 1997 .

[27] Walter T Welford,et al. Aberrations of the symmetrical optical system , 1974 .

[28] Katsushi Ikeuchi,et al. Reflectance Analysis for 3D Computer Graphics Model Generation , 1996, CVGIP Graph. Model. Image Process..

[29] Reinhard Koch,et al. Dynamic 3-D Scene Analysis Through Synthesis Feedback Control , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[30] H. Harashima,et al. Model-Based Analysis Synthesis Coding of Videotelephone Images--Conception and Basic Study of Intelligent Image Coding-- , 1989 .

[31] Roberto Brunelli,et al. Face Recognition: Features Versus Templates , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[32] Shoji Tominaga,et al. Estimating Reflection Parameters from a Single Color Image , 2000, IEEE Computer Graphics and Applications.

[33] Belle L. Tseng,et al. Multiviewpoint video coding with MPEG-2 compatibility , 1996, IEEE Trans. Circuits Syst. Video Technol..

[34] Jörn Ostermann. Object-based analysis-synthesis coding (OBASC) based on the source model of moving flexible 3-D objects , 1994, IEEE Trans. Image Process..

[35] Soo-Chang Pei,et al. Global motion estimation in model-based image coding by tracking three-dimensional contour feature points , 1998, IEEE Trans. Circuits Syst. Video Technol..

[36] Tomaso A. Poggio,et al. Motion Field and Optical Flow: Qualitative Properties , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[37] Gary J. Sullivan,et al. Motion compensation for video compression using control grid interpolation , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[38] Thomas Ertl,et al. Computer Graphics - Principles and Practice, 3rd Edition , 2014 .

[39] Bui Tuong Phong. Illumination for computer generated pictures , 1975, Commun. ACM.

[40] Matthew Turk,et al. A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[41] Alex Pentland,et al. Coding, Analysis, Interpretation, and Recognition of Facial Expressions , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[42] Irfan Essa,et al. Tracking facial motion , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[43] Aggelos K. Katsaggelos,et al. Fast and efficient mode and quantizer selection in the rate distortion sense for H.263 , 1996, Other Conferences.

[44] Henrique S. Malvar,et al. Making Faces , 1998, SIGGRAPH.

[45] M. Carter. Computer graphics: Principles and practice , 1997 .

[46] B K Horn,et al. Calculating the reflectance map. , 1979, Applied optics.

[47] Jörn Ostermann,et al. Automatic adaptation of a face model in a layered coder with an object-based analysis-synthesis layer and a knowledge-based layer , 1997, Signal Process. Image Commun..

[48] Sanjit K. Mitra,et al. Efficient mode selection for block-based motion compensated video coding , 1995, Proceedings., International Conference on Image Processing.

[49] Harvey J. Everett. Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources , 1963 .

[50] Thomas S. Huang,et al. Motion and structure from feature correspondences: a review , 1994, Proc. IEEE.

[51] Jürgen Stauder. Schätzung der Szenenbeleuchtung aus Bewegtbildfolgen , 1999 .

[52] John Lewis,et al. Automated lip-sync: Background and techniques , 1991, Comput. Animat. Virtual Worlds.

[53] Wilfried Enkelmann,et al. Investigations of multigrid algorithms for the estimation of optical flow fields in image sequences , 1988, Comput. Vis. Graph. Image Process..

[54] Charles L. Lawson,et al. Solving least squares problems , 1976, Classics in applied mathematics.

[55] Peter Eisert,et al. Model-based Coding of Facial Image Sequences at Varying Illumination Conditions , 1998 .

[56] Kosuke Sato,et al. Determining Reflectance Properties of an Object Using Range and Brightness Images , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[57] Kiyoharu Aizawa,et al. Analysis and synthesis of facial image sequences in model-based image coding , 1994, IEEE Trans. Circuits Syst. Video Technol..

[58] C. Lawson,et al. Solving least squares problems , 1976, Classics in applied mathematics.

[59] Gunther Wyszecki,et al. Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Edition , 2000 .

[60] Roy Hall,et al. Illumination and Color in Computer Generated Imagery , 1988, Monographs in Visual Communication.

[61] Ken Shoemake,et al. Animating rotation with quaternion curves , 1985, SIGGRAPH.

[62] Dimitris N. Metaxas,et al. The integration of optical flow and deformable models with applications to human face shape and motion estimation , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[63] Peter Eisert,et al. Model-based estimation of facial expression parameters from image sequences , 1997, Proceedings of International Conference on Image Processing.

[64] Subhasis Chaudhuri,et al. Recursive estimation of illuminant motion from flow field , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[65] D. E. Pearson,et al. Developments in model-based video coding , 1995, Proc. IEEE.

[66] Hans-Peter Seidel,et al. Faster evaluation of quadratic bi-variate DMS spline surfaces , 1994 .

[67] Takeo Kanade,et al. Surface Reflection: Physical and Geometrical Perspectives , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[68] R. Ladner. Entropy-constrained Vector Quantization , 2000 .

[69] Andy C. Downton,et al. A switched model-based coder for video signals , 1994, IEEE Trans. Circuits Syst. Video Technol..

[70] Haibo Li,et al. Image sequence coding at very low bit rates: a review , 1994, IEEE Trans. Image Process..

[71] Christophe Schlick,et al. A Survey of Shading and Reflectance Models , 1994, Comput. Graph. Forum.

[72] Thomas S. Huang,et al. Automatic construction of 3D human face models based on 2D images , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[73] R. Rao,et al. Exploiting Audio-Visual Correlation in Coding of Talking Head Sequences , 1996 .

[74] Marc Rioux,et al. Color Reflectance Modeling Using a Polychromatic Laser Range Sensor , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[75] David A. Forsyth,et al. Reflections on Shading , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[76] Parke,et al. Parameterized Models for Facial Animation , 1982, IEEE Computer Graphics and Applications.

[77] Shanon X. Ju,et al. Recognizing Human Motion Using Parameterized Models of Optical Flow , 2022 .

[78] Peter Eisert,et al. Model-aided coding: a new approach to incorporate facial animation into motion-compensated video coding , 2000, IEEE Trans. Circuits Syst. Video Technol..

[79] Matthew Stone,et al. An anthropometric face model using variational techniques , 1998, SIGGRAPH.

[80] Heinrich H Bülthoff,et al. Why the visual recognition system might encode the effects of illumination , 1998, Vision Research.

[81] C. Hjortsjö. Man's face and mimic language , 1969 .

[82] Pat Hanrahan,et al. A realistic camera model for computer graphics , 1995, SIGGRAPH.

[83] Berthold K. P. Horn,et al. Determining Optical Flow , 1981, Other Conferences.

[84] M.G. Bellanger,et al. Digital processing of speech signals , 1980, Proceedings of the IEEE.

[85] J. Salz,et al. Algorithms for estimation of three-dimensional motion , 1985, AT&T Technical Journal.

[86] Richard Szeliski,et al. The lumigraph , 1996, SIGGRAPH.

[87] Janusz Konrad,et al. Motion estimation and compensation under varying illumination , 1994, Proceedings of 1st International Conference on Image Processing.

[88] F. F. Soulié,et al. Connectionists Methods for Human Face Processing , 1998 .

[89] Peter Eisert,et al. Rate-distortion-efficient video compression using a 3-D head model , 1999, Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348).

[90] Peter Eisert,et al. Digital watermarking of MPEG-4 facial animation parameters , 1998, Comput. Graph..

[91] Ioannis Pitas,et al. Extraction of facial regions and features using color and shape information , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[92] Eero P. Simoncelli. Design of multi-dimensional derivative filters , 1994, Proceedings of 1st International Conference on Image Processing.

[93] Kiyoharu Aizawa,et al. Model-based image coding , 1994, Other Conferences.

[94] Demetri Terzopoulos,et al. Analysis and Synthesis of Facial Image Sequences Using Physical and Anatomical Models , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[95] Gene H. Golub,et al. Matrix computations , 1983 .

[96] Touradj Ebrahimi,et al. Dynamic video coding-an overview , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[97] L. Farkas. Anthropometry of the head and face , 1994 .

[98] Timothy A. Clarke,et al. Comparison of some techniques for the subpixel location of discrete target images , 1994, Other Conferences.

[99] E. Land,et al. Lightness and retinex theory. , 1971, Journal of the Optical Society of America.

[100] Nikolaos Grammalidis,et al. Object-based coding of stereo image sequences using joint 3-D motion/disparity compensation , 1997, IEEE Trans. Circuits Syst. Video Technol..

[101] Masahide Kaneko,et al. Interactive Model-based Coding for Multimedia E-mail Environment , 1996 .

[102] Kiyoharu Aizawa,et al. An intelligent facial image coding driven by speech and phoneme , 1989, International Conference on Acoustics, Speech, and Signal Processing,.

[103] Peter Eisert,et al. Analyzing Facial Expressions for Virtual Conferencing , 1998, IEEE Computer Graphics and Applications.

[104] C. Cacou. Anthropometry of the head and face , 1995 .

[105] Kiyoharu Aizawa,et al. Model-based analysis synthesis image coding (MBASIC) system for a person's face , 1989, Signal Process. Image Commun..

[106] Fabio Lavagetto,et al. Lip motion modeling and speech driven estimation , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[107] Roger Mohr,et al. Accuracy in image measure , 1994, Other Conferences.

[108] Michael J. Black,et al. Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion , 1995, Proceedings of IEEE International Conference on Computer Vision.

[109] Jürgen Stauder. Illumination analysis for synthetic/natural hybrid image sequence generation , 1998, Proceedings. Computer Graphics International (Cat. No.98EX149).

[110] Peter Eisert,et al. Model-based 3D-motion estimation with illumination compensation , 1997 .

[111] Rama Chellappa,et al. Estimation of Illuminant Direction, Albedo, and Shape from Shading , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[112] Keith Waters,et al. A muscle model for animating three-dimensional facial expression , 1987, SIGGRAPH.

[113] Pertti Roivainen,et al. 3-D Motion Estimation in Model-Based Facial Image Coding , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[114] Y. J. Tejwani,et al. Robot vision , 1989, IEEE International Symposium on Circuits and Systems,.

[115] Sanjit K. Mitra,et al. Rate-distortion optimized mode selection for very low bit rate video coding and the emerging H.263 standard , 1996, IEEE Trans. Circuits Syst. Video Technol..

[116] Kiyoharu Aizawa,et al. Model-based image coding advanced video coding techniques for very low bit-rate applications , 1995, Proc. IEEE.

[117] 大野義夫,et al. Computer Graphics: Principles and Practice, 2nd edition, J. D. Foley, A. van Dam, S. K. Feiner, J. F. Hughes, Addison-Wesley, 1990 , 1991 .

[118] Roberto Brunelli,et al. Estimation of pose and illuminant direction for face processing , 1994, Image Vis. Comput..

[119] K. Torrance,et al. Theory for off-specular reflection from roughened surfaces , 1967 .

[120] Michael G. Strintzis,et al. Object-based coding of stereo image sequences using three-dimensional models , 1997, IEEE Trans. Circuits Syst. Video Technol..

[121] A. Pentland. Finding the illuminant direction , 1982 .

[122] Skjalg Lepsøy,et al. Conversion of articulatory parameters into active shape model coefficients for lip motion representation and synthesis , 1998, Signal Process. Image Commun..

[123] Dimitris N. Metaxas,et al. Deformable model-based shape and motion analysis from images using motion residual error , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[124] Timothy F. Cootes,et al. Automatic Interpretation and Coding of Face Images Using Flexible Models , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[125] Shahriar Negahdaripour,et al. A generalized brightness change model for computing optical flow , 1993, 1993 (4th) International Conference on Computer Vision.

[126] Paul R. Cohen,et al. Camera Calibration with Distortion Models and Accuracy Evaluation , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[127] Eric Dubois,et al. Estimation of motion fields from image sequences with illumination variation , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[128] Keith Waters,et al. Computer Facial Animation, Second Edition , 1996 .

[129] Alex Pentland. Photometric Motion , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[130] Ingemar J. Cox,et al. Predicting and Estimating the Accuracy of a Subpixel Registration Algorithm , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[131] J. Cohen,et al. Color Science: Concepts and Methods, Quantitative Data and Formulas , 1968 .

[132] Alex Pentland,et al. A vision system for observing and extracting facial action parameters , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[133] Bülent Sankur,et al. Facial feature localization and adaptation of a generic face model for model-based coding , 1995, Signal Process. Image Commun..

[134] ITU-T. Video coding for low bitrate communication , 1996 .

[135] A. Murat Tekalp,et al. 3-D motion estimation and wireframe adaptation including photometric effects for model-based coding of facial image sequences , 1994, IEEE Trans. Circuits Syst. Video Technol..

[136] Thomas S. Huang,et al. 3D head pose computation from 2D images: templates versus features , 1995, Proceedings., International Conference on Image Processing.

[137] Daniel Thalmann,et al. Virtual Human Representation and Communication in VLNet , 1997, IEEE Computer Graphics and Applications.

[138] Haibo Li,et al. Representing and compressing facial animation parameters using facial action basis functions , 1999, IEEE Trans. Circuits Syst. Video Technol..

[139] Demetri Terzopoulos,et al. Realistic modeling for facial animation , 1995, SIGGRAPH.

[140] Frederic Dufaux,et al. Motion estimation techniques for digital TV: a review and a new contribution , 1995, Proc. IEEE.

[141] Charles A. Poynton,et al. Gamma and Its Disguises : The Nonlinear Mappings of Intensity in Perception, CRTs, Film, and Video , 1993 .

[142] Ken-ichi Anjyo,et al. A simple method for extracting the natural beauty of hair , 1992, SIGGRAPH.

[143] Alex Pentland,et al. Probabilistic Visual Learning for Object Representation , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[144] Yao Wang,et al. Speech-assisted lip synchronization in audio-visual communications , 1995, Proceedings., International Conference on Image Processing.

[145] Thomas S. Huang,et al. Uniqueness and Estimation of Three-Dimensional Motion Parameters of Rigid Objects with Curved Surfaces , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[146] 오승준. [Book review] "Digital Video Processing" , 1996 .

[147] Michael J. Black,et al. The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields , 1996, Comput. Vis. Image Underst..

[148] Bernd Girod,et al. Image sequence coding using 3D scene models , 1994, Other Conferences.

[149] Jürgen Stauder,et al. Estimation of point light source parameters for object-based coding , 1995, Signal Process. Image Commun..

[150] Horace Ho-Shing Ip,et al. Script-based facial gesture and speech animation using a NURBS based face model , 1996, Comput. Graph..

[151] Thomas Vetter,et al. Estimating Coloured 3D Face Models from Single Images: An Example Based Approach , 1998, ECCV.

[152] Katsushi Ikeuchi,et al. Object shape and reflectance modeling from observation , 1997, SIGGRAPH.

[153] Azriel Rosenfeld,et al. Improved Methods of Estimating Shape from Shading Using the Light Source Coordinate System , 1985, Artif. Intell..

[154] G. A. Thomas,et al. Television motion measurement for DATV and other applications , 1987 .

[155] Hans Georg Musmann. A layered coding system for very low bit rate video coding , 1995, Signal Process. Image Commun..

[156] Marian Stewart Bartlett,et al. Classifying Facial Actions , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[157] Adam Finkelstein,et al. Robust mesh watermarking , 1999, SIGGRAPH.

[158] Oliver Benedens,et al. Geometry-Based Watermarking of 3D Models , 1999, IEEE Computer Graphics and Applications.

[159] Gary J. Sullivan,et al. Rate-distortion optimization for video compression , 1998, IEEE Signal Process. Mag..

[160] Roberto Cipolla,et al. Determining the gaze of faces in images , 1994, Image Vis. Comput..