Auditory Supplements to Speechreading

Auditory-visual speech recognition is far more accurate and robust than speech recognition by hearing alone. Yet, in spite of the benefits and obvious importance of auditory-visual speech for everyday communication, little is known about the mechanisms involved in auditory-visual speech integration. As a preliminary step toward the development of a generalized model of speech communication that incorporates visual speech cues, it is necessary to delineate the spectral and temporal interactions that occur when visual speech cues are used in tandem with acoustic cues. It will be shown that this interaction is both highly synergistic and non-linear. Further, it is suggested that visual speech cues may serve as a guide for auditory speech processing by informing the listener of spectral and temporal landmarks that can be used to decode the speech message.
