This study investigates the perception of speech quality over telephone channels with time-varying transmission characteristics for simulated conversational structures. The aim is to establish a relationship between the subjective quality associated with short speech samples (5–6 seconds) and the quality associated with overall conversations (1–2 minutes). Two two-part experiments were conducted. In the first part of each experiment, dialog-final ratings within the temporal structure of a telephone conversation were assessed. Varying transmission characteristics were realized with ten different degradation profiles of preprocessed speech samples, obtained mainly from real mobile channels to ensure authentic types of degradation. The second part was carried out to obtain separate short-term ratings of the speech samples used in the first part. Experiments 1 and 2 tested different conversation durations (1 and 2 minutes). The results demonstrate that dialog-final ratings vary with the degradation profile, revealing a recency effect and a strong impact of individual bad samples. Two related models that implement these findings are presented. With these models, dialog-final quality ratings can be estimated significantly better than by plain averaging of short sample ratings (about 10% absolute improvement). They also perform better than two algorithms taken from the literature. Both models can be applied to the instrumental method described in ITU-T Rec. P.862 [1], resulting in about 13% absolute improvement. They were evaluated with the results of two different experiments, which were performed independently but on the basis of our test procedure. In these experiments, similar profiles but a different type of quality degradation, different sample durations, and different speech material were used. The models proved to be valid and reliable for the time span investigated (1–2 minutes) and for the profiles used. One of the models is now being recommended by the ETSI STQ mobile group.
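The abstract does not specify the two models, so the following is only an illustrative sketch of the two reported effects (recency and the strong impact of individual bad samples) set against plain averaging. It is not the authors' method: the function names, the linear weighting scheme, and the parameters `recency_gain` and `bad_sample_weight` are all assumptions chosen for illustration.

```python
from typing import Sequence


def plain_average(ratings: Sequence[float]) -> float:
    """Baseline: unweighted mean of the short-sample ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(ratings) / len(ratings)


def recency_weighted_estimate(
    ratings: Sequence[float],
    recency_gain: float = 1.0,       # hypothetical: extra weight given to the last sample
    bad_sample_weight: float = 0.3,  # hypothetical: pull toward the worst sample
) -> float:
    """Illustrative dialog-final estimate (an assumption, not the published model).

    Later samples receive linearly increasing weights (recency effect), and
    the result is then pulled toward the single worst rating (bad-sample impact).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    n = len(ratings)
    if n == 1:
        weights = [1.0]
    else:
        # Weight grows linearly from 1 (first sample) to 1 + recency_gain (last sample).
        weights = [1.0 + recency_gain * i / (n - 1) for i in range(n)]
    weighted_mean = sum(w * r for w, r in zip(weights, ratings)) / sum(weights)
    worst = min(ratings)
    return (1.0 - bad_sample_weight) * weighted_mean + bad_sample_weight * worst


if __name__ == "__main__":
    # Two hypothetical degradation profiles with the same plain average.
    bad_at_end = [4.0, 4.0, 4.0, 4.0, 2.0]
    bad_at_start = [2.0, 4.0, 4.0, 4.0, 4.0]
    for name, profile in (("bad at end", bad_at_end), ("bad at start", bad_at_start)):
        print(name, plain_average(profile), recency_weighted_estimate(profile))
```

In this sketch, the two profiles have identical plain averages, but the one with the bad sample at the end receives the lower estimate, and both estimates fall below the plain average because of the pull toward the worst sample.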
[1] Phil Gray et al. An Experimental Investigation of the Accumulation of Perceived Error in Time-Varying Speech Distortions. 1997.
[2] Alexander Raake et al. Speech Quality of VoIP - Assessment and Prediction. 2006.
[3] Sylvain Busson et al. Effects of context on the subjective assessment of time-varying speech quality: Listening / conversation, laboratory / real environment. 2004.
[4] Alan Clark et al. Modeling the effects of burst packet loss and recency on subjective voice quality. 2001.
[5] N. Cowan. On short and long auditory stores. Psychological Bulletin, 1984.
[6] Régine Le Bouquin-Jeannès et al. On the Evaluation of the Conversational Speech Quality in Telecommunications. EURASIP J. Adv. Signal Process., 2008.