Dynamic Estimation of the Auditory Temporal Response Function From MEG in Competing-Speaker Environments

Objective: A central problem in computational neuroscience is to characterize brain function, with statistical confidence, from neural activity recorded in response to sensory inputs. Most existing estimation techniques, such as those based on reverse correlation, exhibit two main limitations: first, they cannot produce dynamic estimates of the neural activity at a resolution comparable with that of the recorded data; second, they often require heavy averaging across time and across multiple trials in order to construct statistical confidence intervals for a precise interpretation of the data. In this paper, we address these issues for estimating the auditory temporal response function (TRF) as a parametric computational model for selective auditory attention in competing-speaker environments.
Methods: The TRF is a sparse kernel that regresses auditory MEG data on the envelopes of the speech streams. We develop an efficient estimation technique by exploiting the sparsity of the TRF and adopting an $\ell_1$-regularized least squares estimator, which is capable of producing dynamic TRF estimates as well as confidence intervals at sampling resolution from single-trial MEG data.
Results: We evaluate the performance of our proposed estimator using evoked MEG responses from the human brain in an auditory attention experiment with two competing speakers. The TRFs are estimated dynamically over time using the proposed technique with multisecond resolution, which is a significant improvement over previous results with a temporal resolution on the order of a minute.
Conclusion: Application of our method to MEG data reveals a precise characterization of the modulation of the M50 and M100 evoked responses with respect to the attentional state of the subject at multisecond resolution.
Significance: Our proposed estimation technique provides a high-resolution, real-time attention decoding framework for multispeaker environments, with potential application in smart hearing aid technology.
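
To make the idea of a sparse TRF recovered by $\ell_1$-regularized least squares concrete, the following is a minimal sketch on synthetic data. It is not the estimator developed in the paper: it is a static batch fit using plain proximal gradient descent (ISTA), and it produces neither dynamic estimates nor confidence intervals. The function names (`lagged_design`, `lasso_ista`), the white-noise stand-in for a normalized speech envelope, the two-peak toy kernel, and the value of `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def lagged_design(envelope, n_lags):
    """Build a design matrix whose entry (t, k) is envelope[t - k], zero-padded."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = envelope[:n - k]
    return X

def soft_threshold(v, thresh):
    """Proximal operator of the l1 norm (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # inverse Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Synthetic example (all values hypothetical).
rng = np.random.default_rng(0)
n_samples, n_lags = 2000, 40
envelope = rng.standard_normal(n_samples)   # zero-mean stand-in for a speech envelope
true_trf = np.zeros(n_lags)
true_trf[5], true_trf[12] = 1.0, -0.7       # two sparse peaks in the toy kernel
X = lagged_design(envelope, n_lags)
y = X @ true_trf + 0.1 * rng.standard_normal(n_samples)  # simulated noisy response

trf_hat = lasso_ista(X, y, lam=25.0)
support = np.flatnonzero(trf_hat)           # lags with non-zero estimated weight
```

In this toy setting the penalty drives most lag coefficients exactly to zero while slightly shrinking the two true peaks. A dynamic variant would have to re-estimate the kernel over time, for example on successive windows, which this sketch does not attempt.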
