Using a quantitative psychoacoustical signal representation for objective speech quality measurement

This paper describes the application of a quantitative psychoacoustical signal preprocessing model to objective speech quality measurement. The preprocessing transforms the original and the distorted speech signals into an internal representation, which is regarded as the information accessible to higher neural stages of perception. Comparing these internal representations yields a quality measure that correlates highly with the subjective mean opinion score (MOS) data of various test databases. The parameters of the preprocessing model were derived directly from psychoacoustical data, independently of the present study. The model also predicted the detection thresholds of codec-like distortions obtained in a psychoacoustical experiment. This indicates that the internal representation contains the information relevant for detecting perceivable differences, and it provides evidence for a direct relation between speech quality and the detectability of a distortion.
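The overall structure of such a measure (transform both signals, then compare the results) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `internal_representation` is a hypothetical stand-in that computes only frame-wise log energy and is not the psychoacoustical preprocessing model, and the correlation coefficient is merely one plausible way to compare two representations.

```python
import math
import random


def internal_representation(signal, frame_len=64):
    """Hypothetical stand-in for the preprocessing model (assumption):
    log energy of consecutive non-overlapping frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [math.log10(sum(x * x for x in f) / frame_len + 1e-12)
            for f in frames]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_a * var_b)


def quality_measure(original, distorted, frame_len=64):
    """Compare the representations of the original and distorted signals
    (assumed comparison: correlation coefficient)."""
    return correlation(internal_representation(original, frame_len),
                       internal_representation(distorted, frame_len))


# Toy signals: an amplitude-modulated tone and a noisy copy of it.
rng = random.Random(0)
original = [(0.5 + 0.5 * math.sin(2 * math.pi * n / 1024))
            * math.sin(2 * math.pi * n / 16) for n in range(4096)]
distorted = [x + 0.2 * rng.gauss(0.0, 1.0) for x in original]

q_clean = quality_measure(original, original)
q_noisy = quality_measure(original, distorted)
print(q_clean, q_noisy)
```

In this toy setup an undistorted signal scores higher than its noisy copy. Relating such scores to subjective MOS data would require a listening-test database, which the sketch does not attempt.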
