A unified account of categorical effects in phonetic perception

Categorical effects are found across speech sound categories, with the degree of these effects ranging from extremely strong categorical perception in consonants to nearly continuous perception in vowels. We show that both strong and weak categorical effects can be captured by a unified model. We treat speech perception as a statistical inference problem, assuming that listeners use their knowledge of categories as well as the acoustics of the signal to infer the intended productions of the speaker. Simulations show that the model provides close fits to empirical data, unifying past findings of categorical effects in consonants and vowels and capturing differences in the degree of categorical effects through a single parameter.
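The inference described above can be illustrated with a minimal sketch. The abstract does not give the model's equations, so everything here is an assumption made for illustration: a one-dimensional acoustic cue, equal-variance Gaussian categories with equal prior probability, Gaussian noise on the signal, and the hypothetical names `perceived_target`, `category_var`, and `noise_var`. Under those assumptions, the listener's estimate of the intended production is a posterior-weighted average of per-category estimates, each of which shrinks the heard signal toward that category's mean.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def perceived_target(s, means, category_var, noise_var):
    """Posterior mean of the intended production given the heard signal s.

    Assumptions (not taken from the abstract): Gaussian categories that share
    one variance and have equal priors, plus additive Gaussian noise.
    """
    total_var = category_var + noise_var
    # Posterior probability of each category given the signal (equal priors).
    likelihoods = [gaussian_pdf(s, m, total_var) for m in means]
    z = sum(likelihoods)
    posteriors = [lik / z for lik in likelihoods]
    # Within a category, the estimate is a variance-weighted blend of the
    # signal and the category mean.
    estimates = [(category_var * s + noise_var * m) / total_var for m in means]
    return sum(p * e for p, e in zip(posteriors, estimates))


# Hypothetical category means on an arbitrary cue scale.
means = [-2.0, 2.0]
# Noise large relative to category variance: strong pull toward the mean.
strong = perceived_target(1.0, means, category_var=0.1, noise_var=1.0)
# Noise small relative to category variance: percept stays near the signal.
weak = perceived_target(1.0, means, category_var=1.0, noise_var=0.1)
print(strong, weak)
```

In this sketch, the ratio of category variance to noise variance controls how far a percept is pulled toward the nearest category mean, which is one way a single parameter could span strongly categorical and nearly continuous perception.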
