Efficient Neural Coding in Auditory and Speech Perception

Speech has long been recognized as 'special'. Here, we suggest that one reason for this is that the auditory system has evolved to encode speech efficiently. The theory of efficient neural coding holds that perceptual systems have evolved to encode environmental stimuli as efficiently as possible. Mathematically, a code is optimally efficient when it matches the statistics of the signals it represents. Experimental evidence suggests that the auditory code is optimal in this mathematical sense: the statistical properties of speech closely match the response properties of the cochlea, the auditory nerve, and the auditory cortex. These results may in turn be linked to phenomena in auditory and speech perception.
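The claim that an efficient code matches the statistics of its signals can be illustrated with a toy information-theoretic sketch. The four-symbol source below and the names `signal`, `uniform`, and `expected_code_length` are assumptions made for illustration only; they are not taken from the paper and do not model any auditory data. The sketch compares the average number of bits per symbol when ideal code lengths are derived from the true signal distribution with the cost of a code designed for a different (uniform) distribution.

```python
import math

def entropy(p):
    """Shannon entropy in bits: the lower bound on average code length."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def expected_code_length(p, q):
    """Average bits per symbol when a source with distribution p is encoded
    with ideal code lengths -log2(q), i.e. a code designed for q.
    This is the cross-entropy of p and q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical source: some symbols are much more frequent than others.
signal = [0.5, 0.25, 0.125, 0.125]
# A code that ignores the signal statistics and treats all symbols alike.
uniform = [0.25, 0.25, 0.25, 0.25]

matched = expected_code_length(signal, signal)      # code matched to the signal
mismatched = expected_code_length(signal, uniform)  # code not matched

print(f"entropy of signal:  {entropy(signal):.2f} bits")
print(f"matched code:       {matched:.2f} bits/symbol")
print(f"mismatched code:    {mismatched:.2f} bits/symbol")
```

In this toy case the matched code reaches the entropy of the source, while the mismatched code spends more bits per symbol. This is only the discrete, noiseless version of the idea; neural codes operate under additional constraints that the sketch does not attempt to capture.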
