This paper examines the reliability of scores obtained in subjective evaluation tests of picture quality using the double-stimulus continuous quality-scale (DSCQS) method and the single-stimulus five-grade quality-scale (SSQS) method. First, it is shown that the mean scores on the absolute scale of the DSCQS method do not extend into the higher grades, unlike those of the SSQS method, and that assessors' ability to recognize small quality differences on the absolute scale of the DSCQS method is inferior to that with the SSQS method. Next, the reason why the two methods yield different results even though both use the same five-grade quality scale is investigated. It is found that the scale divisions on the vertical line printed on the DSCQS score sheet can significantly affect both the distribution of scores and assessors' ability to recognize small quality differences. Finally, it is concluded that the graph scale should be replaced with a continuous five-grade category scale to improve the reliability of the DSCQS method.