Very low bit rate video coding using 3-D models

Acknowledgments

I would like to thank my supervisor, Prof. Dr.-Ing. Bernd Girod, who gave me the opportunity to join his group and to work in a stimulating and supportive atmosphere. I would also like to thank Prof. Dr. Günther Greiner for his interest in my work and for reviewing my thesis. During my doctoral studies, I had the pleasure of interacting with a large number of excellent researchers and skilled engineers. I am very thankful to my colleagues and friends and Thomas Wiegand for the many stimulating discussions, joint work, and proofreading. Furthermore, I would like to thank the many others around the Telecommunications Laboratory in Erlangen who made my stay here so enjoyable. I am especially grateful to Ursula Arnold for her invaluable administrative support. Last but not least, I would like to thank my friends and my family for their valuable suggestions and support.

[1]  Yasuhiko Watanabe,et al.  A trigonal prism-based method for hair image generation , 1992, IEEE Computer Graphics and Applications.

[2]  Masahide Kaneko,et al.  Coding of facial image sequence based on a 3-D model of the head and motion detection , 1991, J. Vis. Commun. Image Represent..

[3]  Rama Chellappa,et al.  Human and machine recognition of faces: a survey , 1995, Proc. IEEE.

[4]  J. Ahlberg Extraction and Coding of Face Model Parameters , 1999 .

[5]  David R. Forsey,et al.  Hierarchical B-spline refinement , 1988, SIGGRAPH.

[6]  Gérard G. Medioni,et al.  Detection of Intensity Changes with Subpixel Accuracy Using Laplacian-Gaussian Masks , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Yoshio Nagashima,et al.  Image analysis for face modeling and facial image reconstruction , 1990, Other Conferences.

[8]  Ajit Singh,et al.  An estimation-theoretic framework for image-flow computation , 1990, [1990] Proceedings Third International Conference on Computer Vision.

[9]  C. R. Moloney Methods for illumination-independent processing of digital images , 1991, [1991] IEEE Pacific Rim Conference on Communications, Computers and Signal Processing Conference Proceedings.

[10]  Bernd Girod,et al.  Rate-constrained motion estimation , 1994, Other Conferences.

[11]  Yair Shoham,et al.  Efficient bit allocation for an arbitrary set of quantizers [speech coding] , 1988, IEEE Trans. Acoust. Speech Signal Process..

[12]  Marian Stewart Bartlett,et al.  Classifying Facial Action , 1995, NIPS.

[13]  A. Yuille Deformable Templates for Face Recognition , 1991, Journal of Cognitive Neuroscience.

[14]  Roger Y. Tsai,et al.  A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses , 1987, IEEE J. Robotics Autom..

[15]  Takeo Kanade,et al.  Neural Network-Based Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[16]  Hiroshi Masuda,et al.  Data embedding algorithms for geometrical and non-geometrical targets in three-dimensional polygonal models , 1998, Comput. Commun..

[17]  Werner Blohm,et al.  Lightness determination at curved surfaces with applications to dynamic range compression and model-based coding of facial images , 1997, IEEE Trans. Image Process..

[18]  Tony F. Chan,et al.  An Improved Algorithm for Computing the Singular Value Decomposition , 1982, TOMS.

[19]  Haibo Li,et al.  Two-view facial movement estimation , 1994, IEEE Trans. Circuits Syst. Video Technol..

[20]  Marc Levoy,et al.  Light field rendering , 1996, SIGGRAPH.

[21]  Thomas Wiegand,et al.  Long-term memory motion-compensated prediction , 1999, IEEE Trans. Circuits Syst. Video Technol..

[22]  Chil-Woo Lee,et al.  Detection and Pose Estimation of Human Face with Multiple Model Images (Special Issue on Computer Vision) , 1994 .

[23]  H C Longuet-Higgins,et al.  The visual ambiguity of a moving plane , 1984, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[24]  S. Negahdaripour,et al.  Relaxing the Brightness Constancy Assumption in Computing Optical Flow , 1987 .

[25]  Jechang Jeong,et al.  Three-dimensional mesh warping for natural eye-to-eye contact in Internet video communication , 2000, Visual Communications and Image Processing.

[26]  Peter Eisert,et al.  Speech Driven Synthesis of Talking Head Sequences , 1997 .

[27]  Walter T Welford,et al.  Aberrations of the symmetrical optical system , 1974 .

[28]  Katsushi Ikeuchi,et al.  Reflectance Analysis for 3D Computer Graphics Model Generation , 1996, CVGIP Graph. Model. Image Process..

[29]  Reinhard Koch,et al.  Dynamic 3-D Scene Analysis Through Synthesis Feedback Control , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[30]  H. Harashima,et al.  Model-Based Analysis Synthesis Coding of Videotelephone Images--Conception and Basic Study of Intelligent Image Coding-- , 1989 .

[31]  Roberto Brunelli,et al.  Face Recognition: Features Versus Templates , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[32]  Shoji Tominaga,et al.  Estimating Reflection Parameters from a Single Color Image , 2000, IEEE Computer Graphics and Applications.

[33]  Belle L. Tseng,et al.  Multiviewpoint video coding with MPEG-2 compatibility , 1996, IEEE Trans. Circuits Syst. Video Technol..

[34]  Jörn Ostermann Object-based analysis-synthesis coding (OBASC) based on the source model of moving flexible 3-D objects , 1994, IEEE Trans. Image Process..

[35]  Soo-Chang Pei,et al.  Global motion estimation in model-based image coding by tracking three-dimensional contour feature points , 1998, IEEE Trans. Circuits Syst. Video Technol..

[36]  Tomaso A. Poggio,et al.  Motion Field and Optical Flow: Qualitative Properties , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[37]  Gary J. Sullivan,et al.  Motion compensation for video compression using control grid interpolation , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[38]  Thomas Ertl,et al.  Computer Graphics - Principles and Practice, 3rd Edition , 2014 .

[39]  Bui Tuong Phong Illumination for computer generated pictures , 1975, Commun. ACM.

[40]  Volker Blanz,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[41]  Alex Pentland,et al.  Coding, Analysis, Interpretation, and Recognition of Facial Expressions , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[42]  Irfan Essa,et al.  Tracking facial motion , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[43]  Aggelos K. Katsaggelos,et al.  Fast and efficient mode and quantizer selection in the rate distortion sense for H.263 , 1996, Other Conferences.

[44]  Henrique S. Malvar,et al.  Making Faces , 1998, SIGGRAPH.

[45]  M. Carter Computer graphics: Principles and practice , 1997 .

[46]  B K Horn,et al.  Calculating the reflectance map. , 1979, Applied optics.

[47]  Jörn Ostermann,et al.  Automatic adaptation of a face model in a layered coder with an object-based analysis-synthesis layer and a knowledge-based layer , 1997, Signal Process. Image Commun..

[48]  Sanjit K. Mitra,et al.  Efficient mode selection for block-based motion compensated video coding , 1995, Proceedings., International Conference on Image Processing.

[49]  Harvey J. Everett Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources , 1963 .

[50]  Thomas S. Huang,et al.  Motion and structure from feature correspondences: a review , 1994, Proc. IEEE.

[51]  Jürgen Stauder Schätzung der Szenenbeleuchtung aus Bewegtbildfolgen , 1999 .

[52]  John Lewis,et al.  Automated lip-sync: Background and techniques , 1991, Comput. Animat. Virtual Worlds.

[53]  Wilfried Enkelmann,et al.  Investigations of multigrid algorithms for the estimation of optical flow fields in image sequences , 1988, Comput. Vis. Graph. Image Process..

[54]  Charles L. Lawson,et al.  Solving least squares problems , 1976, Classics in applied mathematics.

[55]  Peter Eisert,et al.  Model-based Coding of Facial Image Sequences at Varying Illumination Conditions , 1998 .

[56]  Kosuke Sato,et al.  Determining Reflectance Properties of an Object Using Range and Brightness Images , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[57]  Kiyoharu Aizawa,et al.  Analysis and synthesis of facial image sequences in model-based image coding , 1994, IEEE Trans. Circuits Syst. Video Technol..

[58]  C. Lawson,et al.  Solving least squares problems , 1976, Classics in applied mathematics.

[59]  Gunther Wyszecki,et al.  Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Edition , 2000 .

[60]  Roy Hall,et al.  Illumination and Color in Computer Generated Imagery , 1988, Monographs in Visual Communication.

[61]  Ken Shoemake,et al.  Animating rotation with quaternion curves , 1985, SIGGRAPH.

[62]  Dimitris N. Metaxas,et al.  The integration of optical flow and deformable models with applications to human face shape and motion estimation , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[63]  Peter Eisert,et al.  Model-based estimation of facial expression parameters from image sequences , 1997, Proceedings of International Conference on Image Processing.

[64]  Subhasis Chaudhuri,et al.  Recursive estimation of illuminant motion from flow field , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[65]  D. E. Pearson,et al.  Developments in model-based video coding , 1995, Proc. IEEE.

[66]  Hans-Peter Seidel,et al.  Faster evaluation of quadratic bi-variate DMS spline surfaces , 1994 .

[67]  Takeo Kanade,et al.  Surface Reflection: Physical and Geometrical Perspectives , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[68]  R. Ladner Entropy-constrained Vector Quantization , 2000 .

[69]  Andy C. Downton,et al.  A switched model-based coder for video signals , 1994, IEEE Trans. Circuits Syst. Video Technol..

[70]  Haibo Li,et al.  Image sequence coding at very low bit rates: a review , 1994, IEEE Trans. Image Process..

[71]  Christophe Schlick,et al.  A Survey of Shading and Reflectance Models , 1994, Comput. Graph. Forum.

[72]  Thomas S. Huang,et al.  Automatic construction of 3D human face models based on 2D images , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[73]  Ram R. Rao,et al.  Exploiting Audio-Visual Correlation in Coding of Talking Head Sequences , 1996 .

[74]  Marc Rioux,et al.  Color Reflectance Modeling Using a Polychromatic Laser Range Sensor , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[75]  David A. Forsyth,et al.  Reflections on Shading , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[76]  Frederic I. Parke Parameterized Models for Facial Animation , 1982, IEEE Computer Graphics and Applications.

[77]  Shanon X. Ju Recognizing Human Motion Using Parameterized Models of Optical Flow , 2022 .

[78]  Peter Eisert,et al.  Model-aided coding: a new approach to incorporate facial animation into motion-compensated video coding , 2000, IEEE Trans. Circuits Syst. Video Technol..

[79]  Matthew Stone,et al.  An anthropometric face model using variational techniques , 1998, SIGGRAPH.

[80]  Heinrich H Bülthoff,et al.  Why the visual recognition system might encode the effects of illumination , 1998, Vision Research.

[81]  C. Hjortsjö Man's face and mimic language , 1969 .

[82]  Pat Hanrahan,et al.  A realistic camera model for computer graphics , 1995, SIGGRAPH.

[83]  Berthold K. P. Horn,et al.  Determining Optical Flow , 1981, Other Conferences.

[84]  M.G. Bellanger,et al.  Digital processing of speech signals , 1980, Proceedings of the IEEE.

[85]  J. Salz,et al.  Algorithms for estimation of three-dimensional motion , 1985, AT&T Technical Journal.

[86]  Richard Szeliski,et al.  The lumigraph , 1996, SIGGRAPH.

[87]  Janusz Konrad,et al.  Motion estimation and compensation under varying illumination , 1994, Proceedings of 1st International Conference on Image Processing.

[88]  F. F. Soulié,et al.  Connectionists Methods for Human Face Processing , 1998 .

[89]  Peter Eisert,et al.  Rate-distortion-efficient video compression using a 3-D head model , 1999, Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348).

[90]  Peter Eisert,et al.  Digital watermarking of MPEG-4 facial animation parameters , 1998, Comput. Graph..

[91]  Ioannis Pitas,et al.  Extraction of facial regions and features using color and shape information , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[92]  Eero P. Simoncelli Design of multi-dimensional derivative filters , 1994, Proceedings of 1st International Conference on Image Processing.

[93]  Kiyoharu Aizawa,et al.  Model-based image coding , 1994, Other Conferences.

[94]  Demetri Terzopoulos,et al.  Analysis and Synthesis of Facial Image Sequences Using Physical and Anatomical Models , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[95]  Gene H. Golub,et al.  Matrix computations , 1983 .

[96]  Touradj Ebrahimi,et al.  Dynamic video coding-an overview , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[97]  L. Farkas Anthropometry of the head and face , 1994 .

[98]  Timothy A. Clarke,et al.  Comparison of some techniques for the subpixel location of discrete target images , 1994, Other Conferences.

[99]  E. Land,et al.  Lightness and retinex theory. , 1971, Journal of the Optical Society of America.

[100]  Nikolaos Grammalidis,et al.  Object-based coding of stereo image sequences using joint 3-D motion/disparity compensation , 1997, IEEE Trans. Circuits Syst. Video Technol..

[101]  Masahide Kaneko,et al.  Interactive Model-based Coding for Multimedia E-mail Environment , 1996 .

[102]  Kiyoharu Aizawa,et al.  An intelligent facial image coding driven by speech and phoneme , 1989, International Conference on Acoustics, Speech, and Signal Processing,.

[103]  Peter Eisert,et al.  Analyzing Facial Expressions for Virtual Conferencing , 1998, IEEE Computer Graphics and Applications.

[104]  C. Cacou Anthropometry of the head and face , 1995 .

[105]  Kiyoharu Aizawa,et al.  Model-based analysis synthesis image coding (MBASIC) system for a person's face , 1989, Signal Process. Image Commun..

[106]  Fabio Lavagetto,et al.  Lip motion modeling and speech driven estimation , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[107]  Roger Mohr,et al.  Accuracy in image measure , 1994, Other Conferences.

[108]  Michael J. Black,et al.  Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion , 1995, Proceedings of IEEE International Conference on Computer Vision.

[109]  Jürgen Stauder Illumination analysis for synthetic/natural hybrid image sequence generation , 1998, Proceedings. Computer Graphics International (Cat. No.98EX149).

[110]  Peter Eisert,et al.  Model-based 3D-motion estimation with illumination compensation , 1997 .

[111]  Rama Chellappa,et al.  Estimation of Illuminant Direction, Albedo, and Shape from Shading , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[112]  Keith Waters,et al.  A muscle model for animating three-dimensional facial expression , 1987, SIGGRAPH.

[113]  Pertti Roivainen,et al.  3-D Motion Estimation in Model-Based Facial Image Coding , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[114]  Y. J. Tejwani,et al.  Robot vision , 1989, IEEE International Symposium on Circuits and Systems,.

[115]  Sanjit K. Mitra,et al.  Rate-distortion optimized mode selection for very low bit rate video coding and the emerging H.263 standard , 1996, IEEE Trans. Circuits Syst. Video Technol..

[116]  Kiyoharu Aizawa,et al.  Model-based image coding advanced video coding techniques for very low bit-rate applications , 1995, Proc. IEEE.

[117]  大野 義夫,et al.  Computer Graphics : Principles and Practice, 2nd edition, J.D. Foley, A.van Dam, S.K. Feiner, J.F. Hughes, Addison-Wesley, 1990 , 1991 .

[118]  Roberto Brunelli,et al.  Estimation of pose and illuminant direction for face processing , 1994, Image Vis. Comput..

[119]  K. Torrance,et al.  Theory for off-specular reflection from roughened surfaces , 1967 .

[120]  Michael G. Strintzis,et al.  Object-based coding of stereo image sequences using three-dimensional models , 1997, IEEE Trans. Circuits Syst. Video Technol..

[121]  A. Pentland Finding the illuminant direction , 1982 .

[122]  Skjalg Lepsøy,et al.  Conversion of articulatory parameters into active shape model coefficients for lip motion representation and synthesis , 1998, Signal Process. Image Commun..

[123]  Dimitris N. Metaxas,et al.  Deformable model-based shape and motion analysis from images using motion residual error , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[124]  Timothy F. Cootes,et al.  Automatic Interpretation and Coding of Face Images Using Flexible Models , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[125]  Shahriar Negahdaripour,et al.  A generalized brightness change model for computing optical flow , 1993, 1993 (4th) International Conference on Computer Vision.

[126]  Paul R. Cohen,et al.  Camera Calibration with Distortion Models and Accuracy Evaluation , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[127]  Eric Dubois,et al.  Estimation of motion fields from image sequences with illumination variation , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[128]  Keith Waters,et al.  Computer Facial Animation, Second Edition , 1996 .

[129]  Alex Pentland Photometric Motion , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[130]  Ingemar J. Cox,et al.  Predicting and Estimating the Accuracy of a Subpixel Registration Algorithm , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[131]  J. Cohen,et al.  Color Science: Concepts and Methods, Quantitative Data and Formulas , 1968 .

[132]  Alex Pentland,et al.  A vision system for observing and extracting facial action parameters , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[133]  Bülent Sankur,et al.  Facial feature localization and adaptation of a generic face model for model-based coding , 1995, Signal Process. Image Commun..

[134]  ITU-T Video coding for low bitrate communication , 1996 .

[135]  A. Murat Tekalp,et al.  3-D motion estimation and wireframe adaptation including photometric effects for model-based coding of facial image sequences , 1994, IEEE Trans. Circuits Syst. Video Technol..

[136]  Thomas S. Huang,et al.  3D head pose computation from 2D images: templates versus features , 1995, Proceedings., International Conference on Image Processing.

[137]  Daniel Thalmann,et al.  Virtual Human Representation and Communication in VLNet , 1997, IEEE Computer Graphics and Applications.

[138]  Haibo Li,et al.  Representing and compressing facial animation parameters using facial action basis functions , 1999, IEEE Trans. Circuits Syst. Video Technol..

[139]  Demetri Terzopoulos,et al.  Realistic modeling for facial animation , 1995, SIGGRAPH.

[140]  Frederic Dufaux,et al.  Motion estimation techniques for digital TV: a review and a new contribution , 1995, Proc. IEEE.

[141]  Charles A. Poynton,et al.  Gamma and Its Disguises : The Nonlinear Mappings of Intensity in Perception, CRTs, Film, and Video , 1993 .

[142]  Ken-ichi Anjyo,et al.  A simple method for extracting the natural beauty of hair , 1992, SIGGRAPH.

[143]  Alex Pentland,et al.  Probabilistic Visual Learning for Object Representation , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[144]  Yao Wang,et al.  Speech-assisted lip synchronization in audio-visual communications , 1995, Proceedings., International Conference on Image Processing.

[145]  Thomas S. Huang,et al.  Uniqueness and Estimation of Three-Dimensional Motion Parameters of Rigid Objects with Curved Surfaces , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[146]  오승준 [Book review] "Digital Video Processing" , 1996 .

[147]  Michael J. Black,et al.  The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields , 1996, Comput. Vis. Image Underst..

[148]  Bernd Girod,et al.  Image sequence coding using 3D scene models , 1994, Other Conferences.

[149]  Jürgen Stauder,et al.  Estimation of point light source parameters for object-based coding , 1995, Signal Process. Image Commun..

[150]  Horace Ho-Shing Ip,et al.  Script-based facial gesture and speech animation using a NURBS based face model , 1996, Comput. Graph..

[151]  Thomas Vetter,et al.  Estimating Coloured 3D Face Models from Single Images: An Example Based Approach , 1998, ECCV.

[152]  Katsushi Ikeuchi,et al.  Object shape and reflectance modeling from observation , 1997, SIGGRAPH.

[153]  Azriel Rosenfeld,et al.  Improved Methods of Estimating Shape from Shading Using the Light Source Coordinate System , 1985, Artif. Intell..

[154]  G. A. Thomas,et al.  Television motion measurement for DATV and other applications , 1987 .

[155]  Hans Georg Musmann A layered coding system for very low bit rate video coding , 1995, Signal Process. Image Commun..

[156]  Marian Stewart Bartlett,et al.  Classifying Facial Actions , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[157]  Adam Finkelstein,et al.  Robust mesh watermarking , 1999, SIGGRAPH.

[158]  Oliver Benedens,et al.  Geometry-Based Watermarking of 3D Models , 1999, IEEE Computer Graphics and Applications.

[159]  Gary J. Sullivan,et al.  Rate-distortion optimization for video compression , 1998, IEEE Signal Process. Mag..

[160]  Roberto Cipolla,et al.  Determining the gaze of faces in images , 1994, Image Vis. Comput..