A three-dimensional model of human lip motions trained from video

We present a 3D model of human lips and develop a framework for training it from real data. The model starts with generic physics specified by the finite element method and "learns" the correct physics from observations. The model's physics provide physically based regularization between sparse observation points, and the resulting set of deformations is used to derive the correct physical modes of the model. We show preliminary results demonstrating the model's ability to reconstruct lip shapes from sparse data. The resulting model can be used for both analysis and synthesis.
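The pipeline described in the abstract can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: a 1D chain of springs stands in for the 3D finite element mesh, principal component analysis (via SVD) stands in for the step that derives modes from the observed deformations, and every name (`chain_stiffness`, `fill_in`, `learn_modes`, `reconstruct`) is hypothetical.

```python
import numpy as np


def chain_stiffness(n, k=1.0):
    """Stiffness matrix of n nodes joined by springs and anchored at both
    ends. Assumption: a toy stand-in for the generic FEM physics."""
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] = 2.0 * k
        if i > 0:
            K[i, i - 1] = -k
        if i < n - 1:
            K[i, i + 1] = -k
    return K


def fill_in(K, obs, u_obs):
    """Physically based regularization: fix the observed displacements and
    choose the unobserved ones to minimize the strain energy 0.5 * u^T K u."""
    u_obs = np.asarray(u_obs, dtype=float)
    n = K.shape[0]
    free = np.setdiff1d(np.arange(n), obs)
    u = np.zeros(n)
    u[obs] = u_obs
    # Stationarity of the energy in the free nodes: K_ff u_f = -K_fo u_o
    u[free] = np.linalg.solve(K[np.ix_(free, free)],
                              -K[np.ix_(free, obs)] @ u_obs)
    return u


def learn_modes(U, m):
    """Derive m modes from a stack of full deformations (one per row).
    Assumption: plain PCA via SVD of the mean-centered data."""
    mean = U.mean(axis=0)
    _, _, Vt = np.linalg.svd(U - mean, full_matrices=False)
    return mean, Vt[:m].T


def reconstruct(mean, modes, obs, u_obs):
    """Reconstruct a full shape from sparse data by least-squares fitting
    the learned mode amplitudes to the observed nodes."""
    u_obs = np.asarray(u_obs, dtype=float)
    a, *_ = np.linalg.lstsq(modes[obs], u_obs - mean[obs], rcond=None)
    return mean + modes @ a


K = chain_stiffness(5)
obs = [1, 3]  # indices of the sparsely observed nodes

# Training: regularize each sparse observation into a full deformation.
samples = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]])
U = np.array([fill_in(K, obs, s) for s in samples])

# Learning: replace the generic physics with modes derived from the data.
mean, modes = learn_modes(U, m=2)

# Analysis: reconstruct a new full shape from two observed points.
shape = reconstruct(mean, modes, obs, [0.5, 1.5])
```

In this toy, pulling the middle node of the five-node chain to 3 yields a linear ramp of roughly 1, 2, 3, 2, 1, which is the interpolation the generic physics provides between observation points. The same learned modes could in principle be driven by chosen amplitudes to generate shapes, which corresponds to the synthesis use mentioned above.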
