Tracking perception of the sounds of English.

Twenty American English listeners identified gated fragments of all 2288 possible English within-word and cross-word diphones, providing a total of 538,560 phoneme categorizations. The results show orderly uptake of acoustic information in the signal and provide a view of where information about segments occurs in time. Information locus depends on each speech sound's identity and phonological features. Affricates and diphthongs have highly localized information so that listeners' perceptual accuracy rises during a confined time range. Stops and sonorants have more distributed and gradually appearing information. The identity and phonological features (e.g., vowel vs. consonant) of the neighboring segment also influence when acoustic information about a segment is available. Stressed vowels are perceived significantly more accurately than unstressed vowels, but this effect is greater for lax vowels than for tense vowels or diphthongs. The dataset charts the availability of perceptual cues to segment identity across time for the full phoneme repertoire of English in all attested phonetic contexts.
