Analyzing the expression of annoyance during phone calls to complaint services

The identification of emotional cues in speech has a large number of applications. Machine learning researchers have analyzed sets of acoustic parameters as potential cues for identifying discrete emotional categories or, alternatively, emotional dimensions. Experiments have typically been carried out on recordings of simulated or induced emotions, although more research has recently addressed spontaneous emotions. However, it is well known that the expression of emotion depends not only on cultural factors but also on the individual and on the specific situation. In this work we deal with tracking shifts in annoyance during real phone calls to complaint services. The audio recordings analyzed show different ways of expressing annoyance, such as disappointment, helplessness, or anger. Nevertheless, variations in intensity-derived parameters, combined with spectral information and suprasegmental features, have proven robust for each speaker and annoyance level. The work also discusses the annotation problem and proposes an extended rating scale that incorporates annotator disagreements. Our frame-level classification results validate the annotation procedure. Experimental results also show that shifts in customer annoyance can potentially be tracked during phone calls.
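As a rough illustration of the kind of frame-level processing the abstract describes, the sketch below extracts intensity-derived and spectral parameters per frame and feeds them to an off-the-shelf classifier. It is a minimal sketch, not the authors' pipeline: the choice of librosa and scikit-learn, the specific features, and the classifier are all assumptions made for illustration.

# Hypothetical frame-level feature extraction for annoyance classification.
# Not the paper's actual pipeline; assumes librosa and scikit-learn.
import librosa
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def frame_features(path, frame_length=2048, hop_length=512):
    y, sr = librosa.load(path, sr=None)
    # Intensity-derived parameter: frame RMS energy converted to dB
    rms = librosa.feature.rms(y=y, frame_length=frame_length,
                              hop_length=hop_length)[0]
    intensity_db = librosa.amplitude_to_db(rms)
    # One piece of spectral information: the spectral centroid per frame
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr,
                                                 hop_length=hop_length)[0]
    # Crude stand-in for suprasegmental variation: frame-to-frame
    # intensity change
    d_intensity = np.gradient(intensity_db)
    return np.stack([intensity_db, d_intensity, centroid], axis=1)

# X: features stacked over calls, shape (n_frames, 3);
# y: per-frame annoyance labels from the annotation procedure
# clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
# frame_predictions = clf.predict(frame_features("call.wav"))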

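The extended rating scale is described only at a high level. A minimal sketch of how annotator disagreements could be folded into intermediate annoyance levels, under the assumption of a three-level base scale rated by two annotators (neither of which is confirmed by the abstract), might look like this:

# Hypothetical extended rating scale keeping annotator disagreements as
# intermediate levels instead of discarding those frames. The concrete
# scheme (0-2 base levels, midpoints for one-step disagreement) is an
# assumption, not the paper's exact scale.
def extended_label(r1, r2):
    """Map two annotators' ratings (0, 1, or 2) to an extended scale."""
    if r1 == r2:
        return float(r1)          # agreement keeps the base level
    if abs(r1 - r2) == 1:
        return (r1 + r2) / 2      # one-step disagreement -> midpoint level
    return None                   # larger disagreement: exclude the frame

print(extended_label(1, 1))  # 1.0
print(extended_label(1, 2))  # 1.5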