Predicting the perceptive judgment of voices in a telecom context: selection of acoustic parameters

The perception of vocal styles is of paramount importance in vocal server applications, as the overall style of a telecom service depends heavily on the voice used. In this work we develop tools for automatically inferring the perceived vocal style of a set of 100 vocal sequences. In a first stage, twenty subjective evaluation criteria were identified through perceptual experiments with naïve listeners. In a second stage, the vocal sequences were parameterised using more than a hundred acoustic features representing prosody, spectral energy distribution, articulation and waveform. Regression analysis and neural networks were then used to predict the subjective score of each voice on each criterion. The prediction error is generally low, which suggests that the perceived quality of the sequences can be predicted automatically. Moreover, the prediction error decreases when non-significant parameters are removed.
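The effect of removing non-significant parameters can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic, the feature count (60), the correlation t-test and its threshold of 3.0 are assumptions chosen for illustration, and the helper names `fit_predict`, `significant_features` and `cv_rmse` are hypothetical. It compares the cross-validated error of a linear regression using all features against one that keeps only features significantly correlated with the score.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept, solved via lstsq."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

def significant_features(X, y, t_threshold=3.0):
    """Indices of features whose correlation with y has |t| above the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    n = len(y)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    t = r * np.sqrt((n - 2) / (1.0 - r**2))
    return np.flatnonzero(np.abs(t) > t_threshold)

def cv_rmse(X, y, select, n_folds=5, seed=0):
    """K-fold cross-validated RMSE, with optional per-fold feature selection."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        if select:
            # Select on the training fold only, so the test fold stays unseen.
            cols = significant_features(X[train_idx], y[train_idx])
        else:
            cols = np.arange(X.shape[1])
        pred = fit_predict(X[train_idx][:, cols], y[train_idx], X[test_idx][:, cols])
        sq_err.append((pred - y[test_idx]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))

# Synthetic stand-in for the data: 100 "voices", 60 "acoustic features",
# of which only the first three actually drive the "subjective score".
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 60))
y = 2.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(scale=0.5, size=100)

rmse_all = cv_rmse(X, y, select=False)
rmse_selected = cv_rmse(X, y, select=True)
print(f"RMSE, all features:         {rmse_all:.3f}")
print(f"RMSE, significant features: {rmse_selected:.3f}")
```

With only 100 observations, a model fitted on many irrelevant features tends to overfit, so discarding them usually lowers the held-out error, which is consistent with the trend reported above.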
