Speech Perception as a Multimodal Phenomenon

Speech perception is inherently multimodal. Visual speech (lip-read) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to propose that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, audiovisual integration is a property of the input information itself, and amodal speech information could explain the reported automaticity, immediacy, and completeness of audiovisual speech integration. However, recent findings suggest that audiovisual integration can be influenced by higher cognitive factors such as lexical status and semantic context. Proponents of amodal accounts will need to explain these results.
