Facial expression and prosodic prominence: Effects of modality and facial area

Abstract This article addresses two related questions about the perception of facial markers of prominence in spoken utterances: (1) how important are visual cues to prominence from the face relative to auditory cues, and (2) do facial areas differ in their cue value for prosodic prominence? The first experiment examines the relation between auditory and visual cues by means of a reaction-time task. Recordings of a sentence with three prosodically prominent words were systematically manipulated so that auditory and visual cues to prominence were either congruent (occurring on the same word) or incongruent (positioned on different words). Participants were instructed to indicate as fast as possible which word they perceived as the most prominent. Results show that participants determine prominence more easily when the visual cue occurs on the same word as the auditory cue, whereas displaced visual cues hinder prominence perception. The second experiment investigates which area of a speaker's face contains the strongest cues to prominence, using stimuli in which either the entire face or only parts of it were visible. Participants indicated for each stimulus which word they perceived as the most prominent. Results show that the upper facial area has stronger cue value for prominence detection than the lower area, and that the left part of the face is more important than the right part. Results for mirror images of the original fragments show that this latter finding is due to both a speaker effect and an observer effect.
