Improving Instrumental Quality Prediction Performance for the Blizzard Challenge

This paper reports the performance of the standard instrumental quality prediction algorithm ITU-T P.563 on speech data from the 2007 and 2008 Blizzard Challenges. The algorithm, which is optimized for natural speech, is shown to correlate poorly with subjective quality ratings. To improve instrumental quality prediction for the Blizzard Challenge, modifications to the algorithm are proposed; in particular, a novel regression tree mapping based on five key features extracted by P.563. Experimental results on the 2008 Challenge dataset show that the improved algorithm substantially outperforms the original standard implementation.
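
As an illustration of what a regression tree mapping from a small feature vector to a quality score can look like, the following is a minimal sketch and not the implementation evaluated in the paper. The five features are left as anonymous columns because the abstract does not name them. The data are synthetic, and the 1 to 5 rating range, the tree depth, the minimum leaf size, and all function names are assumptions made for the example. Performance is summarized with the Pearson correlation between predicted and reference scores.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """Greedy CART-style regression tree; each leaf predicts the mean target."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return {"value": mean(y)}
    best = None  # (error, feature index, threshold)
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, j, t)
    if best is None:  # no admissible split: fall back to a leaf
        return {"value": mean(y)}
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in li], [y[i] for i in li],
                           depth + 1, max_depth, min_leaf),
        "right": build_tree([X[i] for i in ri], [y[i] for i in ri],
                            depth + 1, max_depth, min_leaf),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean score."""
    while "value" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def make_data(rng, n):
    """Synthetic stand-in data: five features, score driven by feature 0 only."""
    X = [[rng.random() for _ in range(5)] for _ in range(n)]
    y = [min(5.0, max(1.0, 1.5 + 3.0 * row[0] + rng.gauss(0.0, 0.2)))
         for row in X]
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(rng, 200)
X_test, y_test = make_data(rng, 100)

tree = build_tree(X_train, y_train)
preds = [predict(tree, row) for row in X_test]
r = pearson(preds, y_test)
print(f"Pearson correlation on held-out synthetic data: {r:.3f}")
```

Because each leaf predicts the mean of its training targets, the mapped scores always stay inside the range of the training ratings, which is one reason a tree is a convenient choice for mapping onto a bounded rating scale.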