Subjective Test Methodology: MOS vs. DMOS in Evaluation of Speech Coding Algorithms

Speech compression algorithms and other speech processing techniques are used extensively in telecommunications applications. The acceptability criteria for public and private network applications include intelligibility, but user acceptance requirements are so stringent that intelligibility tests are of limited value in differentiating between viable speech coding alternatives. Instead, more appropriate methods assess quality globally, over whatever dimensions the listener deems appropriate. These dimensions include naturalness, pleasantness, and absence of artifacts, among others. Two methods frequently used to estimate performance for telecommunications are the mean opinion score (MOS, also known as absolute category rating) and the degradation mean opinion score (DMOS). In the MOS procedure, listeners rate individual samples processed through each codec on a 5-point scale. In the DMOS procedure, listeners hear a sample processed through a reference codec, followed by the same source sample processed through the test codec, and rate the degradation of the second sample relative to the first on a 5-point scale. This paper will compare and contrast these two methods using data