Attention Is Required for Knowledge-Based Sequential Grouping: Insights from the Integration of Syllables into Words

How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention.

SIGNIFICANCE STATEMENT

Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention.
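The idea of concurrently tracking syllabic and word rhythms can be illustrated with a minimal frequency-domain sketch. It is not the study's analysis pipeline. The presentation rates (`SYLLABLE_HZ`, `WORD_HZ`), the sampling rate, the noise level, the neighbor-bin normalization, and all function names are assumptions chosen for illustration, and the data are simulated. The sketch assumes that each rhythm, if tracked, appears as a spectral peak at its own rate in the trial-averaged response.

```python
import numpy as np

# Hypothetical presentation rates, chosen only for illustration.
SYLLABLE_HZ = 4.0   # assumed syllable rate
WORD_HZ = 2.0       # assumed rate of two-syllable words


def simulate(word_amp, fs=100, seconds=20, n_trials=30, seed=0):
    """Simulated single-channel trials: a syllable-rate response in every
    trial, plus a word-rate response scaled by word_amp, in Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(fs * seconds) / fs
    signal = (np.sin(2 * np.pi * SYLLABLE_HZ * t)
              + word_amp * np.sin(2 * np.pi * WORD_HZ * t))
    return signal + rng.normal(0.0, 2.0, size=(n_trials, t.size))


def tagged_spectrum(eeg, fs):
    """Amplitude spectrum of the trial-averaged response.
    eeg has shape (n_trials, n_samples)."""
    evoked = eeg.mean(axis=0)
    amp = np.abs(np.fft.rfft(evoked)) / evoked.size
    freqs = np.fft.rfftfreq(evoked.size, d=1.0 / fs)
    return freqs, amp


def peak_snr(freqs, amp, target_hz, n_neighbors=4):
    """Amplitude at the bin nearest target_hz divided by the mean amplitude
    of n_neighbors bins on each side of it."""
    k = int(np.argmin(np.abs(freqs - target_hz)))
    below = amp[max(k - n_neighbors, 0):k]
    above = amp[k + 1:k + 1 + n_neighbors]
    return amp[k] / np.mean(np.concatenate([below, above]))


if __name__ == "__main__":
    # word_amp > 0 mimics a condition with word tracking; 0 mimics none.
    for label, word_amp in [("with word response", 0.5),
                            ("without word response", 0.0)]:
        freqs, amp = tagged_spectrum(simulate(word_amp), fs=100)
        print(label,
              "syllable SNR:", round(peak_snr(freqs, amp, SYLLABLE_HZ), 1),
              "word SNR:", round(peak_snr(freqs, amp, WORD_HZ), 1))
```

In this toy setup the syllable-rate peak is present in both simulated conditions, while the word-rate peak appears only when a word-rate component is added, which mirrors the dissociation described above in qualitative terms only.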
