Neural responses to natural and model-matched stimuli reveal distinct computations in primary and nonprimary auditory cortex

A central goal of sensory neuroscience is to construct models that can explain neural responses to natural stimuli. As a consequence, sensory models are often tested by comparing neural responses to natural stimuli with model responses to those stimuli. One challenge is that distinct model features are often correlated across natural stimuli, and thus model features can predict neural responses even if they do not in fact drive them. Here, we propose a simple alternative for testing a sensory model: we synthesize a stimulus that yields the same model response as each of a set of natural stimuli, and test whether the natural and “model-matched” stimuli elicit the same neural responses. We used this approach to test whether a common model of auditory cortex, in which spectrogram-like peripheral input is processed by linear spectrotemporal filters, can explain fMRI responses in humans to natural sounds. Prior studies have shown that this model has good predictive power throughout auditory cortex, but this finding could reflect feature correlations in natural stimuli. We observed that fMRI responses to natural and model-matched stimuli were nearly equivalent in primary auditory cortex (PAC), but that nonprimary regions, including those selective for music or speech, showed highly divergent responses to the two sound sets. This dissociation between primary and nonprimary regions was less clear from model predictions, owing to the influence of feature correlations across natural stimuli. Our results provide a signature of hierarchical organization in human auditory cortex, and suggest that nonprimary regions compute higher-order stimulus properties that are not well captured by traditional models. Our methodology enables stronger tests of sensory models and could be broadly applied in other domains.
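The logic of the model-matching test can be illustrated with a small simulation. The sketch below is a toy, not the synthesis procedure used in the study: it assumes a purely linear model (a hypothetical filter matrix `W`), simulated "voxels", and a hidden stimulus feature `v_hidden` that the model cannot see but that co-varies with a model feature across "natural" stimuli. All names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_dims, n_filters = 500, 64, 8

# Hypothetical linear model: each row of W is one filter (an assumption of this toy).
W = rng.standard_normal((n_filters, n_dims))

# Orthonormal basis of stimulus space: the first n_filters rows of Vt span the
# filters' row space; the remaining rows span directions the model ignores.
_, _, Vt = np.linalg.svd(W, full_matrices=True)
v_hidden = Vt[-1]                  # a stimulus feature outside the model
P = np.linalg.pinv(W) @ W          # projector onto the model's row space


def model_response(x):
    """Model features: filter outputs, one row per stimulus."""
    return x @ W.T


def model_match(x_nat, rng):
    """Keep the model-visible part of x_nat and replace the rest with noise."""
    noise = rng.standard_normal(x_nat.shape)
    return x_nat @ P + noise @ (np.eye(n_dims) - P)


def corr(u, w):
    return np.corrcoef(u, w)[0, 1]


# "Natural" stimuli: the hidden feature is made to co-vary with model feature 0.
x_nat = rng.standard_normal((n_stimuli, n_dims))
a = x_nat @ W[0]
hidden = a / a.std() + 0.3 * rng.standard_normal(n_stimuli)
x_nat = x_nat - np.outer(x_nat @ v_hidden, v_hidden) + np.outer(hidden, v_hidden)

x_mm = model_match(x_nat, rng)

# Simulated voxels: "primary" is driven by model features, "nonprimary" by the hidden one.
w_primary = rng.standard_normal(n_filters)


def primary(x):
    return model_response(x) @ w_primary


def nonprimary(x):
    return x @ v_hidden


# Conventional test: regress the nonprimary response on model features (natural stimuli).
F = np.column_stack([model_response(x_nat), np.ones(n_stimuli)])
beta, *_ = np.linalg.lstsq(F, nonprimary(x_nat), rcond=None)
prediction_corr = corr(F @ beta, nonprimary(x_nat))

# Model-matched test: compare responses to natural and model-matched stimuli.
primary_match_corr = corr(primary(x_nat), primary(x_mm))
nonprimary_match_corr = corr(nonprimary(x_nat), nonprimary(x_mm))

print(f"model prediction of nonprimary voxel (natural stimuli): {prediction_corr:.2f}")
print(f"natural vs. model-matched, primary voxel:    {primary_match_corr:.2f}")
print(f"natural vs. model-matched, nonprimary voxel: {nonprimary_match_corr:.2f}")
```

In this toy setting, the model predicts the simulated nonprimary voxel well on natural stimuli only because the hidden feature is correlated with a model feature. The model-matched comparison separates the two cases: the primary voxel responds almost identically to both stimulus sets, whereas the nonprimary voxel's responses to the two sets are close to uncorrelated.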
