Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part II – Perceptual Model

In two closely related papers we present POLQA (Perceptual Objective Listening Quality Assessment), the third-generation perceptual objective speech quality measurement algorithm, standardized by the International Telecommunication Union (ITU-T) as Recommendation P.863 in 2011. The algorithm simulates subjects who rate the quality of a speech fragment in a listening test on a five-point opinion scale. Compared to PESQ (Perceptual Evaluation of Speech Quality), the second generation of objective speech quality measurement, the new standard predicts subjective speech quality, expressed as Mean Opinion Scores, significantly more accurately. POLQA predicts speech quality over a wide range of distortions, from “High Definition” super-wideband speech (HD Voice, audio bandwidth up to 14 kHz) to extremely distorted narrowband telephony speech (audio bandwidth down to 2 kHz), using sample rates between 8 and 48 kHz. POLQA is suited for distortions that are outside the scope of PESQ, such as linear frequency response distortions, time stretching/compression as found in Voice-over-IP, certain types of codec distortions, reverberation, and the impact of playback volume. POLQA outperforms PESQ in assessing any kind of degradation, making it an ideal tool for all speech quality measurements in today’s and future mobile and IP-based networks. This paper (Part II) outlines the core elements of the underlying perceptual model and presents the final results.