Objective assessment of speech quality by combining Bark- and Mel-scale frequency

Perceptual evaluation of speech quality (PESQ, ITU-T P.862) is a well-known objective method for speech quality assessment. PESQ uses a Bark-scale frequency representation to estimate the mean opinion score (MOS) for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs. This paper presents a new objective estimation method that combines Bark-scale and Mel-scale frequency to improve the accuracy of PESQ. An objective assessment based on Mel-scale frequency is built following the PESQ framework, and the Bark-based and Mel-based scores are then combined through score fusion. Experimental results show that the objective score of the Mel-scale method alone correlates well with the subjective score. Comparative results show an improvement when Bark-scale frequency is combined with Mel-scale frequency.
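As a rough illustration of the two frequency scales and of score fusion, the sketch below converts a frequency in Hz to Mel and to Bark and then fuses two quality scores. It rests on assumptions that do not come from the paper: the conversions are common textbook approximations (the 2595·log10 Mel formula and the Zwicker-Terhardt arctangent Bark formula), not necessarily the warping used inside PESQ or by the authors, and the fusion rule is a hypothetical weighted average with an illustrative `weight` parameter, since the abstract does not state the actual fusion rule.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Common Mel approximation (assumed; not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_bark(f_hz: float) -> float:
    """Zwicker-Terhardt Bark approximation (assumed; PESQ defines its own warping)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def fuse_scores(bark_score: float, mel_score: float, weight: float = 0.5) -> float:
    """Hypothetical score fusion: convex combination of the two objective scores.

    `weight` is the share given to the Bark-based score; the value 0.5 is
    purely illustrative and would normally be tuned against subjective MOS.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * bark_score + (1.0 - weight) * mel_score


if __name__ == "__main__":
    for f in (250.0, 1000.0, 3400.0):
        print(f"{f:7.1f} Hz -> {hz_to_mel(f):8.2f} mel, {hz_to_bark(f):5.2f} Bark")
    # Example scores are made up for demonstration only.
    print("fused score:", fuse_scores(3.0, 4.0, weight=0.5))
```

Both scales compress high frequencies relative to low ones but with different curvature, which is one plausible reason the two resulting scores could carry complementary information worth fusing.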
