Communication and coarticulation in facial animation

Our goal is to produce a high-level programming language or tool for 3D animation of facial expressions, especially those conveying information correlated with the intonation of the voice. This includes the differences of timing, pitch, and emphasis that are related to such semantic distinctions of discourse as "given" and "new" information, some of which are also correlated with affect or emotion. Until now, systems have not embodied such a rule-governed translation from speech and utterance meaning to facial expressions. Our algorithm embodies rules that describe and coordinate these relations (intonation/information, intonation/emotion, and facial expression/emotion). Given an utterance, we consider how the discourse information (what is new or old information in the given context, or what is the "topic" of the discourse) is transmitted through the choice of accents and their placement, how it is conveyed through facial expression, and how the two are coordinated. The facial model integrates action at several levels, including individual muscles, groups of muscles, and eye and head motion, as well as the propagation and interaction of these movements, especially coarticulation effects. This study offers a higher level of representation of facial actions by grouping them into specialized functions (lip shapes for phonemes, eyebrow and head motions as emphatic movements). The key elements of this work are the integration of FACS (the Facial Action Coding System, a facial notation system devised by P. Ekman and W. Friesen) and its Action Units (muscle actions); a solution to lip synchronization; a repertory of the different types of facial expressions involved with speech; and a treatment of speaker/listener interaction. This representation is used to drive an animation system linked to facial motion.
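
As a rough illustration of what a rule-governed mapping from intonation and discourse status to facial actions might look like, consider the sketch below. It is a hypothetical example, not the system described in this abstract: the `Word` and `FacialAction` types, the `emphatic_actions` function, and the single rule it applies (raise the eyebrows over an accented word carrying "new" information) are all assumptions made for illustration. The only details taken from FACS are the labels AU1 (inner brow raiser) and AU2 (outer brow raiser).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Word:
    """One word of an utterance with hypothetical timing and discourse tags."""
    text: str
    start: float    # onset time in seconds
    end: float      # offset time in seconds
    accented: bool  # True if the word carries a pitch accent
    status: str     # "new" or "given" information in the discourse


@dataclass(frozen=True)
class FacialAction:
    """A facial action scheduled over a time interval."""
    name: str
    start: float
    end: float


def emphatic_actions(words: List[Word]) -> List[FacialAction]:
    """Assumed rule: an accented word marked "new" triggers a brow raise
    (FACS AU1 + AU2) spanning the duration of that word."""
    actions = []
    for w in words:
        if w.accented and w.status == "new":
            actions.append(FacialAction("AU1+AU2", w.start, w.end))
    return actions


# Usage: only "Paris" is both accented and new, so it alone gets a brow raise.
utterance = [
    Word("Julia", 0.00, 0.40, accented=False, status="given"),
    Word("prefers", 0.40, 0.85, accented=False, status="given"),
    Word("Paris", 0.85, 1.30, accented=True, status="new"),
]
for action in emphatic_actions(utterance):
    print(action.name, action.start, action.end)
```

A table of such rules, one per relation (intonation/information, intonation/emotion, facial expression/emotion), would be one way to keep the mapping declarative; whether the original system is organized this way is not stated in the abstract.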
