Automated lip-sync: Background and techniques

SUMMARY The problem of creating mouth animation synchronized to recorded speech is discussed. A review of a model of speech sound generation indicates that automatically deriving mouth movement from a speech soundtrack is a tractable problem. Several automatic lip-sync techniques are compared, and one method is described in detail. In this method, linear prediction, a common speech synthesis technique, is adapted to provide simple and accurate phoneme recognition. The recognized phonemes are associated with mouth positions to provide keyframes for computer animation of speech. Experience with this technique indicates that automatic lip-sync can produce useful results.
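The pipeline outlined in the summary (linear prediction, then phoneme recognition, then mouth-position keyframes) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The linear prediction step uses the standard autocorrelation method with the Levinson-Durbin recursion. The recognition step is a simplifying assumption: each frame is matched to the nearest reference coefficient vector by Euclidean distance. The function names, the `references` and `mouth_shapes` tables, the "rest" shape for silent frames, and the fixed non-overlapping framing are all hypothetical and chosen only for illustration.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the linear prediction normal equations by Levinson-Durbin.

    Returns (a, err), where x[n] is predicted as
    a[0]*x[n-1] + a[1]*x[n-2] + ... and err is the residual energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        # Reflection coefficient for model order i + 1.
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        prev = a[:]
        a[i] = k
        for j in range(i):
            a[j] = prev[j] - k * prev[i - 1 - j]
        err *= 1.0 - k * k
    return a, err


def nearest_phoneme(coeffs, references):
    """Pick the reference whose coefficients are closest (assumed metric)."""
    def sq_dist(name):
        return sum((c - ref) ** 2 for c, ref in zip(coeffs, references[name]))
    return min(references, key=sq_dist)


def lip_sync_keyframes(samples, frame_len, order, references, mouth_shapes):
    """Return (sample_index, mouth_shape) keyframes, one per change of shape."""
    keys = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        r = autocorrelation(samples[start:start + frame_len], order)
        if r[0] == 0.0:
            shape = "rest"      # hypothetical shape for silent frames
        else:
            coeffs, _ = levinson_durbin(r, order)
            shape = mouth_shapes[nearest_phoneme(coeffs, references)]
        if not keys or keys[-1][1] != shape:
            keys.append((start, shape))
    return keys


def synth(a1, a2, n):
    """Toy test signal: impulse response of x[n] = a1*x[n-1] + a2*x[n-2]."""
    x = []
    for i in range(n):
        x1 = x[i - 1] if i >= 1 else 0.0
        x2 = x[i - 2] if i >= 2 else 0.0
        x.append((1.0 if i == 0 else 0.0) + a1 * x1 + a2 * x2)
    return x
```

As a usage example, calling `lip_sync_keyframes(synth(1.5, -0.8, 400) + synth(0.2, -0.5, 400), 400, 2, references, mouth_shapes)` with one reference vector per toy signal yields one keyframe per segment. Real speech would need overlapping windowed frames, a higher prediction order, and a distance measure better suited to spectra than raw coefficient distance.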
