A model problem in the representation of digital image sequences

Abstract: An optimal feature-based representation is computed to characterize sequences drawn from a family of digital images arising in the animation of a speaking face. The relation between spatial and temporal correlations is considered, and two different experiments are presented. A low-dimensional characterization of lip motion is generated in terms of the 20 most significant features. The mathematical framework supports both the synthesis (generation of real and simulated motion) and the analysis (classification) of lip motion. Words are represented both as small matrices and as curves in the plane, and are shown to have distinct signatures that appear to be robust. Typical compression ratios range from O(100:1) to O(1000:1).
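
The following is a minimal sketch of how such a low-dimensional characterization could be computed. It assumes that the "optimal features" are the leading principal components (Karhunen-Loève eigenimages) of the frame ensemble, which the abstract does not state explicitly. The function names `kl_features` and `reconstruct`, the frame size, and the synthetic "lip motion" data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def kl_features(frames, k=20):
    """Project a frame sequence onto its k leading eigenimages.

    frames: array of shape (T, H, W), one image per time step.
    Returns the mean image (flattened), the basis (k, H*W),
    and the time-dependent coefficients (T, k).
    """
    T = frames.shape[0]
    X = frames.reshape(T, -1).astype(float)   # one flattened frame per row
    mean = X.mean(axis=0)
    A = X - mean                              # mean-subtracted ensemble
    # Economy SVD: the rows of Vt are orthonormal eigenimages,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    basis = Vt[:k]
    coeffs = A @ basis.T                      # analysis: frame -> k numbers
    return mean, basis, coeffs

def reconstruct(mean, basis, coeffs, shape):
    """Synthesis: rebuild frames from their coefficients."""
    return (mean + coeffs @ basis).reshape((-1,) + shape)

# Synthetic stand-in for an image sequence (assumption: real data would
# be frames of a speaking face). Three fixed spatial patterns are mixed
# with smooth temporal weights.
rng = np.random.default_rng(0)
T, H, W = 60, 16, 24
t = np.linspace(0.0, 2.0 * np.pi, T)
patterns = rng.normal(size=(3, H * W))
frames = (np.outer(np.sin(t), patterns[0])
          + np.outer(np.cos(t), patterns[1])
          + np.outer(np.sin(2 * t), patterns[2])).reshape(T, H, W)

mean, basis, coeffs = kl_features(frames, k=20)
approx = reconstruct(mean, basis, coeffs, (H, W))
rel_err = np.linalg.norm(approx - frames) / np.linalg.norm(frames)

# A sequence as a "small matrix" (T x k) and as a curve in the plane
# traced by its first two coefficients over time.
signature = coeffs
curve = coeffs[:, :2]

# Per-frame ratio of pixels to stored coefficients; this ignores the
# one-time cost of storing the mean and the basis.
per_frame_ratio = (H * W) / coeffs.shape[1]
```

In this sketch each frame is summarized by 20 coefficients, so a spoken word becomes a T-by-20 matrix, and plotting `curve` gives one possible planar signature. The ratio computed here depends entirely on the toy frame size and is not meant to reproduce the compression figures quoted in the abstract.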
