Neocortical activity tracks the hierarchical linguistic structures of self-produced speech during reading aloud

How the human brain uses self-generated auditory information during speech production remains unsettled. Current theories of language production posit two monitoring systems: a feedback monitoring system that monitors the auditory consequences of speech output, and an internal monitoring system that predicts the auditory consequences of speech before its production. To gain insight into the underlying neural processes, we investigated the coupling between neuromagnetic activity and the temporal envelope of the heard speech sounds (i.e., cortical tracking of speech) in a group of adults who 1) read a text aloud, 2) listened to a recording of their own speech (i.e., playback), and 3) listened to another speech recording. Reading aloud was used here as a particular form of speech production that shares various processes with natural speech. During reading aloud, the reader's brain tracked the slow temporal fluctuations of the speech output. Specifically, auditory cortices tracked phrases (<1 Hz), but to a lesser extent than during the two listening conditions. The tracking of words (2-4 Hz) and syllables (4-8 Hz) occurred at the parietal opercula during reading aloud and at the auditory cortices during listening. Directionality analyses were then used to identify the monitoring systems involved in processing self-generated auditory information. These analyses revealed that the cortical tracking of speech at <1 Hz, 2-4 Hz and 4-8 Hz was dominated by speech-to-brain directional coupling during both reading aloud and listening, indicating that the cortical tracking of speech during reading aloud mainly entails auditory feedback processing. Nevertheless, brain-to-speech directional coupling at 4-8 Hz was enhanced during reading aloud compared with listening, likely reflecting the establishment of predictions about the auditory consequences of speech before production. These data bring novel insights into how auditory verbal information is tracked by the human brain during perception and self-generation of connected speech.
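As a rough illustration of what "coupling between neuromagnetic activity and the temporal envelope of speech" can mean in practice, the sketch below extracts a Hilbert envelope from a simulated audio signal and computes Welch magnitude-squared coherence with a simulated cortical signal in the three frequency bands named above. It is a minimal sketch and not the authors' pipeline: the sampling rates, the 0.2 Hz lower edge for the phrase band, the 5 s analysis window, the simulated signals, and the helper names `speech_envelope` and `band_coherence` are all assumptions made for illustration. It covers neither source reconstruction nor the directionality analyses.

```python
import numpy as np
from scipy.signal import butter, coherence, hilbert, resample_poly, sosfiltfilt

FS_AUDIO = 8000  # Hz, assumed audio sampling rate (must be an integer)
FS_MEG = 200     # Hz, assumed sampling rate of the simulated cortical signal
# Bands from the abstract; the 0.2 Hz lower edge for phrases is an assumption.
BANDS = {"phrases": (0.2, 1.0), "words": (2.0, 4.0), "syllables": (4.0, 8.0)}


def speech_envelope(audio, fs_audio, fs_out):
    """Hilbert amplitude envelope, resampled to the cortical sampling rate."""
    env = np.abs(hilbert(audio))
    return resample_poly(env, fs_out, fs_audio)


def band_coherence(x, y, fs, bands, win_s=5.0):
    """Mean magnitude-squared coherence (Welch estimate) within each band."""
    f, cxy = coherence(x, y, fs=fs, nperseg=int(win_s * fs))
    return {name: float(cxy[(f >= lo) & (f <= hi)].mean())
            for name, (lo, hi) in bands.items()}


# --- Simulated data (purely illustrative, no real speech or MEG) ---
rng = np.random.default_rng(0)
n_audio = 60 * FS_AUDIO  # 60 s of "audio"

# Slow random amplitude modulation (< 10 Hz) applied to a noise carrier.
sos = butter(4, 10.0, btype="low", fs=FS_AUDIO, output="sos")
slow = sosfiltfilt(sos, rng.standard_normal(n_audio))
modulator = np.clip(1.0 + 0.4 * slow / slow.std(), 0.05, None)
audio = modulator * rng.standard_normal(n_audio)

env = speech_envelope(audio, FS_AUDIO, FS_MEG)
env_z = (env - env.mean()) / env.std()

# "Tracking" signal: circularly shifted copy of the envelope (100 ms) plus noise.
delay = int(0.1 * FS_MEG)
tracking = np.roll(env_z, delay) + rng.standard_normal(env_z.size)
# Control signal: noise unrelated to the envelope.
independent = rng.standard_normal(env_z.size)

coupled = band_coherence(env_z, tracking, FS_MEG, BANDS)
uncoupled = band_coherence(env_z, independent, FS_MEG, BANDS)

for name in BANDS:
    print(f"{name}: coupled={coupled[name]:.2f}, uncoupled={uncoupled[name]:.2f}")
```

In this simulation the coupled signal should show clearly higher coherence than the unrelated control in every band. Coherence alone is symmetric, so it cannot separate speech-to-brain from brain-to-speech coupling; that distinction requires a directed measure such as the directionality analyses mentioned in the abstract.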
