Cortical Measures of Phoneme-Level Speech Encoding Correlate with the Perceived Clarity of Natural Speech

Abstract: In real-world environments, humans comprehend speech by actively integrating prior knowledge (P) and expectations with sensory input. Recent studies have revealed effects of prior information in temporal and frontal cortical areas and have suggested that these effects are underpinned by enhanced encoding of speech-specific features rather than by a broad enhancement or suppression of cortical activity. However, it remains unclear how the integration of bottom-up sensory responses and top-down predictions affects the specific hierarchical stages of processing involved in speech comprehension. It is also unclear whether the predictability that comes with prior information affects speech encoding differently from the perceptual enhancement that accompanies that prediction. One way to investigate these issues is to examine the impact of P on indices of cortical tracking of continuous speech features. Here, we did this by recording non-invasive electroencephalography (EEG) while participants listened to degraded speech sentences that either were or were not preceded by a clear recording of the same sentence. We assessed the impact of prior information on an isolated index of cortical tracking that reflected phoneme-level processing. Our findings suggest that prior information affects the early encoding of natural speech in a dual manner. First, as hypothesized, the availability of prior information enhanced the perceived clarity of degraded speech, and this enhancement was positively correlated with changes in phoneme-level encoding across subjects. Second, P induced an overall reduction of this cortical measure, which we interpret as resulting from the increase in predictability.
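The dual effect described in the abstract can be illustrated with a minimal sketch: a per-subject change score (with prior minus without prior) is computed for both perceived clarity and the phoneme-level encoding index, and the two change scores are then correlated across subjects. Everything in the block is an assumption made for illustration: the function names `prior_effect` and `pearson_r` are hypothetical, the numbers are made-up toy values rather than data from the study, and a plain Pearson correlation stands in for whatever statistic the authors actually used.

```python
import numpy as np


def prior_effect(with_prior, without_prior):
    """Per-subject change score: condition with prior minus condition without."""
    return np.asarray(with_prior, dtype=float) - np.asarray(without_prior, dtype=float)


def pearson_r(x, y):
    """Pearson correlation between two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


# Toy values for five hypothetical subjects (NOT data from the study).
clarity_with = [4, 5, 3, 5, 4]        # clarity ratings, prior available
clarity_without = [2, 2, 2, 3, 3]     # clarity ratings, no prior
index_with = [0.030, 0.045, 0.020, 0.035, 0.022]     # encoding index, prior available
index_without = [0.050, 0.055, 0.050, 0.055, 0.050]  # encoding index, no prior

clarity_change = prior_effect(clarity_with, clarity_without)
index_change = prior_effect(index_with, index_without)

# Effect 1: across subjects, larger clarity gains go with larger index changes.
r_across_subjects = pearson_r(clarity_change, index_change)

# Effect 2: the index is lower on average when the prior is available.
mean_index_change = float(index_change.mean())

print(f"across-subject r = {r_across_subjects:.3f}")
print(f"mean index change = {mean_index_change:.4f}")
```

In these toy values the two effects coexist: every subject's index drops when the prior is available, yet subjects who report larger clarity gains show smaller drops, so the across-subject correlation is positive.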
