Puzzle-solving science: the quixotic quest for units in speech perception

Although speech signals are continuous and variable, listeners experience segmentation and linguistic structure in perception. For years, researchers have tried to identify the basic building block of speech perception. In that time, experimental methods and constraints on stimulus materials have evolved, sources of variance have been identified, and computational models have been advanced. As a result, the slate of candidate units has grown, and each candidate has its own empirical support. In this article, we endorse Grossberg's adaptive resonance theory (ART), proposing that speech units are emergent properties of perceptual dynamics. By this view, units only "exist" when disparate features achieve resonance, a level of perceptual coherence that allows conscious encoding. We outline basic principles of ART, then summarize five experiments. Three experiments assessed the power of social influence to affect phoneme-syllable competitions. Two other experiments assessed repetition effects in monitoring data. Together, the data suggest that "primary" speech units are strongly and symmetrically affected by bottom-up and top-down knowledge sources.
