Analysis, interpretation and synthesis of facial expressions

This thesis describes a computer vision system for observing the "action units" of a face using video sequences as input. The visual observation (sensing) is achieved by coupling an optimal-estimation optical flow method with a geometric model and a physical (muscle) model of the facial structure. This modeling yields a time-varying spatial patterning of facial shape and a parametric representation of the independent muscle action groups responsible for the observed facial motions. These muscle action patterns are then used for analysis, interpretation, recognition, and synthesis of facial expressions. Thus, by interpreting facial motions within a physics-based optimal estimation framework, a new control model of facial movement is developed. The newly extracted action units (which we name "FACS+") are both physics- and geometry-based, and extend the well-known FACS parameters for facial expressions by adding temporal information and non-local spatial patterning of facial motion. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
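The abstract does not spell out the estimator, so the following is only a minimal sketch of the general optimal-estimation idea, not the thesis's method. It assumes a single muscle activation that changes slowly over time and is observed through noisy, flow-derived measurements, and it tracks that activation with a scalar Kalman filter. The function name `kalman_track`, the random-walk state model, and the noise variances are all hypothetical choices made for illustration.

```python
def kalman_track(measurements, process_var=1e-3, meas_var=1e-2):
    """Scalar Kalman filter with a random-walk state model.

    Illustrative assumption: each measurement is a noisy reading of one
    muscle activation; the variances are made-up example values.
    """
    estimate, variance = 0.0, 1.0  # hypothetical initial guess and uncertainty
    history = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        variance += process_var
        # Update: weight the measurement residual by the Kalman gain.
        gain = variance / (variance + meas_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        history.append(estimate)
    return history


if __name__ == "__main__":
    # A constant activation of 0.5 observed without noise, for simplicity.
    track = kalman_track([0.5] * 50)
    print(track[0], track[-1])
```

With a large initial variance the filter trusts the first measurements heavily and then settles toward a steady gain, which is the balance between model prediction and observation that an optimal estimation framework formalizes. A full system along the lines of the abstract would presumably estimate a vector of activations coupled through the physical face model rather than one scalar.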
