Modulation spectra of natural sounds and ethological theories of auditory processing.

The modulation statistics of natural sound ensembles were analyzed by calculating the probability distributions of the amplitude envelope of the sounds and their time-frequency correlations, as given by the modulation spectra. These modulation spectra were obtained by calculating the two-dimensional Fourier transform of the autocorrelation matrix of the sound stimulus in its spectrographic representation. Since temporal bandwidth and spectral bandwidth are conjugate variables, it is shown that the joint modulation spectrum of sound occupies a restricted space: sounds cannot have rapid temporal and rapid spectral modulations simultaneously. Within this restricted space, it is shown that natural sounds have a characteristic signature. Natural sounds are, in general, low-pass, with most of their modulation energy at low temporal and low spectral modulations. Animal vocalizations and human speech are further characterized by the fact that most of their spectral modulation power is found only at low temporal modulations. Similarly, the distributions of the amplitude envelopes exhibit characteristic shapes for natural sounds, reflecting the high probability of epochs with no sound, systematic differences across frequencies, and, for vocalizations, a relatively uniform distribution of the log amplitudes. It is postulated that the auditory system, as well as engineering applications, may exploit these statistical properties to obtain an efficient representation of behaviorally relevant sounds. To test this hypothesis, we show how to create synthetic sounds with first- and second-order envelope statistics identical to those found in natural sounds.
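
As a rough illustration of the modulation spectrum described above, the following Python sketch computes one for a toy signal. It rests on several assumptions that are not taken from the paper: a Gaussian-windowed short-time Fourier transform with arbitrary window and hop sizes, a log-amplitude spectrogram, and the shortcut that the squared magnitude of the two-dimensional Fourier transform of the spectrogram equals the Fourier transform of its (circular) autocorrelation. The function names `log_spectrogram` and `modulation_spectrum` are hypothetical.

```python
import numpy as np

def log_spectrogram(signal, win_len=256, hop=64):
    """Log-amplitude spectrogram, shape (frequency, time).

    Window shape and sizes are illustrative assumptions.
    """
    n = np.arange(win_len)
    # Gaussian window whose standard deviation is win_len / 6 samples
    window = np.exp(-0.5 * ((n - (win_len - 1) / 2) / (win_len / 6)) ** 2)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    amplitude = np.abs(np.fft.rfft(frames, axis=0))
    return np.log(amplitude + 1e-6)  # small floor avoids log(0)

def modulation_spectrum(spec, fs, win_len=256, hop=64):
    """Power of the 2-D Fourier transform of a mean-removed spectrogram.

    By the Wiener-Khinchin relation this equals the 2-D Fourier transform
    of the spectrogram's circular autocorrelation.
    Returns (power, spectral_mod in cycles/Hz, temporal_mod in Hz).
    """
    centered = spec - spec.mean()
    power = np.abs(np.fft.fftshift(np.fft.fft2(centered))) ** 2
    df = fs / win_len   # spacing of spectrogram rows in Hz
    dt = hop / fs       # spacing of spectrogram columns in seconds
    spectral_mod = np.fft.fftshift(np.fft.fftfreq(spec.shape[0], d=df))
    temporal_mod = np.fft.fftshift(np.fft.fftfreq(spec.shape[1], d=dt))
    return power, spectral_mod, temporal_mod

# Toy input: white noise with an 8 Hz amplitude modulation (not a natural sound)
fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
sound = (1.0 + 0.8 * np.sin(2 * np.pi * 8 * t)) * rng.standard_normal(fs)

spec = log_spectrogram(sound)
power, spectral_mod, temporal_mod = modulation_spectrum(spec, fs)

# Locate the strongest modulation component
i, j = np.unravel_index(np.argmax(power), power.shape)
peak_spectral_mod, peak_temporal_mod = spectral_mod[i], temporal_mod[j]
```

For this toy signal the energy should concentrate near zero spectral modulation and a temporal modulation close to 8 Hz, because the imposed envelope is common to all frequency bands. Averaging `power` over many sounds from an ensemble would give an ensemble-level estimate in the spirit of the analysis summarized above.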
