Distinct neural ensemble response statistics are associated with recognition and discrimination of natural sound textures

The perception of sound textures, a class of natural sounds defined by statistical sound structure such as fire, wind, and rain, has been proposed to arise through the integration of time-averaged summary statistics. Where and how the auditory system might encode these summary statistics to create internal representations of these stationary sounds, however, is unknown. Here, using natural textures and synthetic variants with reduced statistics, we show that summary statistics modulate the correlations between frequency-organized neuron ensembles in the awake rabbit inferior colliculus. These neural ensemble correlation statistics capture high-order sound structure and allow for accurate neural decoding in a single-trial recognition task with evidence accumulation times approaching 1 s. In contrast, the average activity across the neural ensemble (neural spectrum) provides a fast (tens of ms) and salient signal that contributes primarily to texture discrimination. Intriguingly, perceptual studies in human listeners reveal analogous trends: the sound spectrum is integrated quickly and serves as a salient discrimination cue, while high-order sound statistics are integrated slowly and contribute substantially more towards recognition. The findings suggest that statistical sound cues such as the sound spectrum and correlation structure are represented by distinct response statistics in auditory midbrain ensembles, and that these neural response statistics may have dissociable roles and time scales for the recognition and discrimination of natural sounds.

SIGNIFICANCE STATEMENT Being able to recognize and discriminate natural sounds, such as from a running stream, a crowd clapping, or ruffling leaves, is a critical task of the normally functioning auditory system. Humans can easily perform such tasks, yet they can be particularly difficult for the hearing impaired and they challenge our most sophisticated computer algorithms. This difficulty is attributed to the complex physical structure of such natural sounds and the fact that they are not unique: they vary randomly in a statistically defined manner from one excerpt to another. Here we provide the first evidence, to our knowledge, that the central auditory system is able to encode and utilize statistical sound cues for natural sound recognition and discrimination behaviors.
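The two response statistics contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function name `ensemble_statistics`, the array layout (recording sites by time bins), and the synthetic data are all assumptions made for illustration. The "neural spectrum" is taken here as the time-averaged activity at each site, and the ensemble correlation statistics as the pairwise Pearson correlations between sites.

```python
import numpy as np

def ensemble_statistics(responses):
    """Return (neural_spectrum, correlations) for a hypothetical response array.

    responses: shape (n_sites, n_time_bins), binned activity per recording site.
    This layout is an assumption for illustration, not taken from the paper.
    """
    responses = np.asarray(responses, dtype=float)
    # Time-averaged activity at each site (the "neural spectrum" in this sketch).
    neural_spectrum = responses.mean(axis=1)
    # Pairwise Pearson correlations between sites (rows are treated as variables).
    correlations = np.corrcoef(responses)
    return neural_spectrum, correlations

# Synthetic example: sites 0 and 1 share a common fluctuation, site 2 is independent.
rng = np.random.default_rng(0)
n_bins = 1000
shared = rng.normal(size=n_bins)
responses = np.vstack([
    5.0 + shared + 0.1 * rng.normal(size=n_bins),
    2.0 + shared + 0.1 * rng.normal(size=n_bins),
    2.0 + rng.normal(size=n_bins),
])
spectrum, corr = ensemble_statistics(responses)
```

In this toy data, sites 1 and 2 have nearly the same mean activity but differ in how they co-vary with site 0, which is the kind of distinction a correlation statistic can carry and an average cannot.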
