The effect of lip-reading on primary stream segregation.

Lip-reading has been shown to improve the intelligibility of speech in multitalker situations, where auditory stream segregation naturally takes place. This study investigated whether the benefit of lip-reading results from a primary audiovisual interaction that enhances the obligatory streaming mechanism. Two behavioral experiments were conducted with sequences of French vowels that alternated in fundamental frequency. In Experiment 1, listeners attempted to identify the order of the items in a sequence; in Experiment 2, they attempted to detect a disruption of temporal isochrony across alternate items. Both tasks are impaired by streaming and therefore provide a measure of primary, obligatory streaming. Visual lip gestures articulating the alternate vowels were synchronized with the auditory sequences. Overall, the results were consistent with the hypothesis that visual lip gestures enhance segregation by acting on primary auditory streaming. Moreover, the effect of the lip gestures on streaming may increase as the naturalness of the visual gestures and auditory vowels, and hence their audiovisual congruence, increases.
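
To make the stimulus structure described above concrete, the following Python sketch builds the timing of one such sequence: vowels alternating in fundamental frequency, presented isochronously, with an optional temporal shift applied to alternate items (the isochrony disruption of Experiment 2) and lip gestures paired with the alternate vowels. This is an illustrative sketch only, not the authors' implementation; all parameter values, names, and defaults are assumptions chosen for illustration.

# Illustrative sketch (not the authors' code): event timing for a sequence of
# vowels alternating in F0, with an optional onset shift on alternate items
# and synchronized lip gestures on those same items. Parameter values are
# assumed for illustration, not taken from the study.

from dataclasses import dataclass
from typing import List

@dataclass
class VowelEvent:
    onset_s: float         # onset time within the sequence (seconds)
    f0_hz: float           # fundamental frequency of the vowel
    vowel: str             # vowel identity (e.g., a French vowel)
    has_lip_gesture: bool  # whether a synchronized lip gesture accompanies it

def build_sequence(vowels: List[str],
                   n_items: int = 12,
                   ioi_s: float = 0.25,        # inter-onset interval (assumed)
                   low_f0_hz: float = 100.0,   # assumed F0 of one subset
                   high_f0_hz: float = 150.0,  # assumed F0 of the other subset
                   alternate_shift_s: float = 0.0) -> List[VowelEvent]:
    """Alternate F0 across successive items; shift every other item's onset by
    `alternate_shift_s` to break isochrony (0.0 = perfectly isochronous).
    Lip gestures are paired with the alternate items only."""
    events = []
    for i in range(n_items):
        is_alternate = (i % 2 == 1)
        onset = i * ioi_s + (alternate_shift_s if is_alternate else 0.0)
        events.append(VowelEvent(
            onset_s=onset,
            f0_hz=high_f0_hz if is_alternate else low_f0_hz,
            vowel=vowels[i % len(vowels)],
            has_lip_gesture=is_alternate,
        ))
    return events

if __name__ == "__main__":
    # Example: a 20-ms shift on alternate items, the cue listeners must detect.
    seq = build_sequence(["a", "i", "u"], alternate_shift_s=0.02)
    for ev in seq:
        print(f"{ev.onset_s:5.3f} s  {ev.vowel}  {ev.f0_hz:5.1f} Hz  "
              f"lip gesture: {ev.has_lip_gesture}")

When the two F0 subsets segregate into separate streams, the relative timing of alternate items becomes harder to judge, which is why both the order-identification and isochrony-detection tasks index obligatory streaming.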
