Enhanced neural tracking of the fundamental frequency of the voice

Objective. ‘F0 tracking’ is a novel method that investigates the neural processing of the fundamental frequency of the voice (f0) in continuous speech. Through linear modelling, a feature that reflects the stimulus f0 is predicted from the EEG data. The neural response strength is then evaluated through the correlation between the predicted and the actual f0 feature. The aim of this study was to improve this f0-tracking method by optimizing the f0 feature.

Approach. Specifically, we aimed to design a feature that approximates the expected EEG response to the f0. We hypothesized that this would improve neural tracking results, because the more similar the feature and the neural response are, the easier it is to reconstruct one from the other. Two techniques were explored: a phenomenological model to simulate neural processing in the auditory periphery, and a low-pass filter to approximate the effect of more central processing on the f0 response. Since these optimizations target different aspects of the auditory system, they were also applied cumulatively.

Results. EEG responses evoked by a Flemish story in 34 subjects indicated that both the use of the auditory model and the addition of the low-pass filter significantly improved the correlations between the actual and the reconstructed feature. The combination of both strategies almost doubled the mean correlation over subjects, from 0.078 to 0.13. Moreover, canonical correlation analysis with the modelled feature revealed two distinct processes contributing to the f0 response: one driven by the compound activity of auditory nerve fibers with center frequencies up to 8 kHz, and one driven predominantly by auditory nerve fibers with center frequencies below 1 kHz.

Significance. The optimized f0 features developed in this study enhance the analysis of f0-tracking responses and facilitate future research and applications.
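The decoding step summarized in the abstract (predict the f0 feature from the EEG with a linear model, then correlate the prediction with the actual feature) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's actual pipeline: the abstract only states that linear modelling is used, so ridge-regularized least squares on time-lagged EEG is a stand-in, the function names (`lag_matrix`, `fit_decoder`, `reconstruction_correlation`, `simulate`) are hypothetical, and the data are synthetic noise with a delayed copy of the feature mixed into each channel.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of the EEG: (samples, channels * n_lags).

    Lag k pairs the feature at time t with the EEG at time t + k,
    because the neural response follows the stimulus.
    """
    n_samples, n_channels = eeg.shape
    lagged = np.zeros((n_samples, n_channels * n_lags))
    for k in range(n_lags):
        cols = slice(k * n_channels, (k + 1) * n_channels)
        lagged[: n_samples - k, cols] = eeg[k:]
    return lagged


def fit_decoder(eeg, feature, n_lags, ridge):
    """Fit a linear backward model (EEG -> feature) with ridge regularization."""
    X = lag_matrix(eeg, n_lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ feature)


def reconstruction_correlation(eeg, feature, weights, n_lags):
    """Pearson correlation between the reconstructed and the actual feature."""
    predicted = lag_matrix(eeg, n_lags) @ weights
    return np.corrcoef(predicted, feature)[0, 1]


def simulate(n_samples, n_channels, delay, rng):
    """Toy data (assumption): each channel is noise plus a delayed feature copy."""
    feature = rng.standard_normal(n_samples)
    gains = np.linspace(0.5, 1.5, n_channels)  # arbitrary per-channel gains
    eeg = rng.standard_normal((n_samples, n_channels))
    eeg[delay:] += 0.3 * feature[:-delay, None] * gains
    return eeg, feature


rng = np.random.default_rng(0)
eeg, feature = simulate(n_samples=6000, n_channels=8, delay=2, rng=rng)

# Train on the first part of the recording, evaluate on the held-out rest.
weights = fit_decoder(eeg[:4000], feature[:4000], n_lags=5, ridge=10.0)
r = reconstruction_correlation(eeg[4000:], feature[4000:], weights, n_lags=5)
print(f"held-out reconstruction correlation: {r:.3f}")
```

Evaluating the correlation on held-out data matters in this setting: with many lagged channels the decoder can overfit, and the correlations reported for real f0 tracking are small, so an in-sample estimate would be optimistically biased.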
