MPEG-4 Face and Body Animation Coding Applied to HCI

The MPEG-4 Face and Body Animation (FBA) standard provides a comprehensive description of humanoid geometry and animation, together with a very low bit-rate codec for Face and Body Animation Parameters (FAPs and BAPs) that enables transmission of MPEG-4 FBA streams over any digital network. Human behavior captured on video can be converted into an FBA stream for subsequent use in HCI systems that operate locally or over a network in a client-server architecture. Visual communication, animated entertainment, audio-visual speech and speaker recognition, and gesture recognition can then be performed directly on the FBA stream anywhere in the network, even when local resources are limited.
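
To make concrete what an FBA stream carries at the application level, the sketch below represents one frame of quantized FAP values and packs it into a compact binary payload. This is a minimal illustration only: the FAPFrame class, the byte layout, and the chosen FAP numbers are assumptions made here for clarity, not the normative MPEG-4 FBA bitstream syntax, which uses FAPU-normalized parameters and entropy coding to reach its very low bit rates.

```python
# Illustrative sketch of one frame of MPEG-4 Facial Animation Parameters (FAPs).
# The class name, field layout, and byte packing are assumptions for this example;
# they are not the bitstream syntax defined by the MPEG-4 FBA standard.

import struct
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FAPFrame:
    """One frame of low-level FAPs, keyed by FAP number (hypothetical subset)."""
    frame_number: int
    faps: Dict[int, int] = field(default_factory=dict)  # FAP id -> quantized value

    def pack(self) -> bytes:
        """Pack the frame as (frame_number, count, then id/value pairs).

        The fixed 16-bit fields here are an illustrative choice, not the
        standard's coded representation.
        """
        out = struct.pack(">IH", self.frame_number, len(self.faps))
        for fap_id, value in sorted(self.faps.items()):
            out += struct.pack(">Hh", fap_id, value)
        return out


# Example: a frame that opens the jaw and stretches the lip corners slightly
# (FAP numbers shown here are for illustration).
frame = FAPFrame(frame_number=0, faps={3: 120, 6: 40, 7: 40})
payload = frame.pack()
print(len(payload), "bytes for", len(frame.faps), "FAPs")
```

Even this naive packing stays small per frame, which suggests why a properly coded FBA stream can remain practical on low-bandwidth links between capture clients and HCI servers.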
