Exploring the distribution of statistical feature parameters for natural sound textures

Sounds such as “running water” and “buzzing bees” belong to a class of sounds that arise from the superposition of many similar acoustic events; these are known as “sound textures”. A recent psychoacoustic study of sound textures [1] reported that natural-sounding textures can be synthesized from white noise by imposing statistical features, such as marginals and correlations, computed from the outputs of a cochlear model responding to the textures. These outputs are the envelopes of bandpass filter responses, termed “cochlear envelopes”. This suggests that the perceptual qualities of many natural sounds derive directly from such statistical features, and raises the question of how these features are distributed in the acoustic environment. To address this question, we collected a corpus of 200 sound textures from public online sources and analyzed the distributions of the textures’ marginal statistics (mean, variance, skew, and kurtosis), cross-frequency correlations, and modulation power statistics. A principal component analysis of these parameters revealed a great deal of redundancy. For example, just two marginal principal components, which can be thought of as measuring the sparseness or burstiness of a texture, capture as much as 66% of the variance of the 128-dimensional marginal parameter space, while the first two principal components of the cochlear correlations capture as much as 90% of the variance in over 1000 correlation parameters. Knowledge of the statistical distributions documented here may help guide the choice of acoustic stimuli with high ecological validity in future research.
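The analysis pipeline described above (subband envelopes, per-band marginal statistics, then principal component analysis) can be illustrated with a minimal sketch. It is not the authors' implementation. The rectangular log-spaced filterbank is a crude stand-in for a cochlear model, the choice of 32 bands (giving 4 × 32 = 128 marginal parameters) is an assumption, and the toy corpus of gated noise replaces the 200 recorded textures. All function names are hypothetical.

```python
import numpy as np

def subband_envelopes(x, sr, n_bands=32, f_lo=50.0, f_hi=7000.0):
    """Crude stand-in for a cochlear model (assumption): log-spaced
    rectangular bands in the FFT domain, then the Hilbert envelope."""
    n = len(x)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(n, d=1.0 / sr)
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    envelopes = np.empty((n_bands, n))
    for b in range(n_bands):
        # Keep positive frequencies in the band; doubling them yields
        # the analytic signal, whose magnitude is the envelope.
        mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
        envelopes[b] = np.abs(np.fft.ifft(2.0 * spectrum * mask))
    return envelopes

def marginal_stats(env):
    """Mean, variance, skew and kurtosis of each band's envelope."""
    mean = env.mean(axis=1)
    centered = env - mean[:, None]
    var = (centered ** 2).mean(axis=1)
    std = np.sqrt(var)
    skew = (centered ** 3).mean(axis=1) / std ** 3
    kurt = (centered ** 4).mean(axis=1) / std ** 4
    return np.concatenate([mean, var, skew, kurt])

def pca_explained_variance(features):
    """Fraction of variance captured by each principal component
    of the z-scored feature matrix (rows = textures)."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    s = np.linalg.svd(z, compute_uv=False)
    return s ** 2 / np.sum(s ** 2)

# Toy corpus (assumption): white noise gated in 10 ms blocks, with the
# gate probability p controlling how sparse or bursty each texture is.
rng = np.random.default_rng(0)
sr, n = 16000, 16000
corpus = []
for p in np.linspace(0.05, 1.0, 20):
    gate = np.repeat(rng.random(n // 160) < p, 160)
    x = rng.standard_normal(n) * (gate + 0.05)  # small floor avoids silence
    corpus.append(marginal_stats(subband_envelopes(x, sr)))

features = np.array(corpus)              # shape: (20 textures, 128 parameters)
ratios = pca_explained_variance(features)
print("variance captured by first two PCs:", ratios[:2].sum())
```

Because the toy textures differ only in burstiness, a large share of the variance is expected to fall on the first few components; the figures reported in the abstract come from the real corpus and will not be reproduced by this sketch.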

[1]  Andrew J King,et al.  Sensory cortex is optimized for prediction of future input , 2017, bioRxiv.

[2]  Eero P. Simoncelli,et al.  A Parametric Texture Model Based on Joint Statistics of Complex Wavelet Coefficients , 2000, International Journal of Computer Vision.

[3]  N. C. Singh,et al.  Modulation spectra of natural sounds and ethological theories of auditory processing. , 2003, The Journal of the Acoustical Society of America.

[4]  B. Julesz,et al.  Visual discrimination of textures with identical third-order statistics , 1978, Biological Cybernetics.

[5]  F. Attneave.  Some informational aspects of visual perception. , 1954, Psychological Review.

[6]  Eero P. Simoncelli,et al.  Sound Texture Perception via Statistics of the Auditory Periphery: Evidence from Sound Synthesis , 2022 .

[7]  Xavier Serra,et al.  Freesound technical demo , 2013, ACM Multimedia.

[8]  Béla Julesz,et al.  Visual Pattern Discrimination , 1962, IRE Trans. Inf. Theory.

[9]  C.-C. Jay Kuo,et al.  Environmental sound recognition: A survey , 2013, 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.

[10]  James R. Bergen,et al.  Pyramid-based texture analysis/synthesis , 1995, Proceedings., International Conference on Image Processing.

[11]  J. Schnupp,et al.  Tuning to Natural Stimulus Dynamics in Primary Auditory Cortex , 2006, Current Biology.

[12]  D J Field,et al.  Relations between the statistics of natural images and the response properties of cortical cells. , 1987, Journal of the Optical Society of America. A, Optics and image science.

[13]  Hagai Attias,et al.  Coding of Naturalistic Stimuli by Auditory Midbrain Neurons , 1997, NIPS.

[14]  David J. Field,et al.  Sparse coding with an overcomplete basis set: A strategy employed by V1? , 1997, Vision Research.

[15]  Jan W. H. Schnupp,et al.  Emergence of Tuning to Natural Stimulus Statistics along the Central Auditory Pathway , 2011, PloS one.

[16]  Hagai Attias,et al.  Temporal Low-Order Statistics of Natural Sounds , 1996, NIPS.

[17]  Brian R Glasberg,et al.  Derivation of auditory filter shapes from notched-noise data , 1990, Hearing Research.

[18]  Michael S. Lewicki,et al.  Efficient coding of natural sounds , 2002, Nature Neuroscience.

[19]  R. Voss,et al.  ‘1/f noise’ in music and speech , 1975, Nature.

[20]  H. B. Barlow,et al.  Possible Principles Underlying the Transformations of Sensory Messages , 2012 .