Envelope analysis methods for tonality estimation

In perceptual audio coders, the audio signal masks the quantization noise. How effective this masking is depends on how tonal or noise-like the signal is. Hence, in the psychoacoustic model (PM) of a perceptual coder, tonality estimation methods can be used to adjust the level of the estimated masking thresholds. This paper introduces three envelope analysis methods for tonality estimation: optimized amplitude modulation ratio (AM-R), auditory image correlation, and temporal envelope rate. The methods were implemented in a filter bank-based PM. In a subjective quality test, they were compared to each other and to an existing method, the partial spectral flatness measure (PSFM). The PSFM and the AM-R were rated significantly higher than the other methods.
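As a rough illustration of what a tonality estimator measures, the sketch below computes the textbook full-band spectral flatness of a single frame: the ratio of the geometric mean to the arithmetic mean of its power spectrum. This is not the PSFM evaluated in the paper, nor any of the three envelope analysis methods. The function name `spectral_flatness`, the Hann window, the frame length, and the sample rate are all assumptions chosen for the example.

```python
import numpy as np


def spectral_flatness(frame, eps=1e-12):
    """Textbook spectral flatness of one frame (hypothetical helper).

    Returns a value near 0 for tone-like frames and a clearly larger
    value (up to 1) for noise-like frames.
    """
    # Window the frame to limit spectral leakage (assumed choice: Hann).
    windowed = frame * np.hanning(len(frame))
    # Power spectrum; eps avoids log(0) on empty bins.
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return geometric_mean / arithmetic_mean


# Assumed test signals: a 1 kHz sine and white Gaussian noise.
fs = 48000          # sample rate in Hz (assumption)
n = 2048            # frame length in samples (assumption)
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * 1000 * t)
noise = np.random.default_rng(0).standard_normal(n)

sfm_tone = spectral_flatness(tone)
sfm_noise = spectral_flatness(noise)
print(f"tone:  {sfm_tone:.4f}")
print(f"noise: {sfm_noise:.4f}")
```

A psychoacoustic model could map such a measure to a masking threshold offset, lowering the threshold more for tone-like maskers than for noise-like ones. How the paper's methods perform that mapping is not stated in the abstract.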
