The role of visual information in the processing of

Visual information provided by a talker’s mouth movements can influence the perception of certain speech features. In the “McGurk effect,” for example, when the syllable /bi/ is presented auditorily in synchrony with a visually presented /gi/, the perceiver hears the talker as saying /di/. Moreover, studies have shown that place and voicing features interact in phonetic perception when the information is presented auditorily. In our first experiment, we asked whether such feature interactions also occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens alone were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The voicing boundary for the auditory-visual tokens was located at a significantly longer voice onset time (VOT) than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified auditorily. In three follow-up experiments, we show that (1) the voicing boundary is not shifted in the absence of a change in the global percept, even when discrepant auditory-visual information is presented; (2) the number of response alternatives provided to the subjects does not affect the categorization or the VOT boundary of the auditory-visual stimuli; and (3) the original VOT boundary shift is not replicated when subjects are instructed to “relabel” the /b-p/ auditory stimuli as /d/ or /t/: the subjects successfully relabeled the stimuli, but no shift in the VOT boundary was observed.
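To make the notion of a "voicing boundary" concrete, the following is a minimal sketch of one common way to estimate it: take the VOT at which the proportion of voiceless responses crosses 50%, using linear interpolation between adjacent continuum steps. This is an assumption for illustration only; the abstract does not state how the boundaries were computed. The function name `boundary_50` and all numbers below are hypothetical and are not data from the study.

```python
def boundary_50(vot_ms, prop_voiceless):
    """Estimate the VOT (ms) at which the identification function crosses 0.5.

    Uses linear interpolation between the two adjacent continuum steps
    that straddle the 50% point. Assumes vot_ms is in ascending order.
    """
    points = list(zip(vot_ms, prop_voiceless))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < 0.5 <= y1:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("identification function never crosses 0.5")


# Invented illustrative values, NOT results from the experiments.
vot_ms = [0, 10, 20, 30, 40, 50]
auditory_only = [0.0, 0.0, 0.25, 0.75, 1.0, 1.0]    # proportion of /p/ responses
auditory_visual = [0.0, 0.0, 0.0, 0.25, 0.75, 1.0]  # proportion of /t/ responses

b_a = boundary_50(vot_ms, auditory_only)
b_av = boundary_50(vot_ms, auditory_visual)
shift = b_av - b_a  # positive means the auditory-visual boundary lies at a longer VOT

print(f"auditory-only boundary:   {b_a:.1f} ms")
print(f"auditory-visual boundary: {b_av:.1f} ms")
print(f"boundary shift:           {shift:.1f} ms")
```

A positive `shift` corresponds to the direction of the effect reported for the first experiment, where the auditory-visual boundary fell at a longer VOT than the auditory-only boundary. In practice, boundaries are often estimated per subject from a fitted psychometric function rather than by interpolation; the sketch uses interpolation only because it needs no curve fitting.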
