Combining speech-based and linguistic classifiers to recognize emotion in user spoken utterances

Abstract In this paper we propose combining speech-based and linguistic classification to obtain better emotion recognition results for users' spoken utterances. These approaches are usually considered in isolation, and are even developed by different communities working on emotion recognition and sentiment analysis. We propose modeling the user's emotional state by fusing the outputs generated by both approaches, taking into account information that the individual approaches usually neglect, such as the interaction context and errors, and the peculiarities of transcribed spoken utterances. The fusion approach allows different recognizers to be employed, and it can be integrated as an additional module in the architecture of a spoken conversational agent, where the information it generates serves as an additional input for the dialog manager when deciding the next system response. We have evaluated our proposal using three emotionally-colored databases and obtained very positive results.
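As an illustration only, the idea of fusing the outputs of the two recognizers can be sketched as a weighted late fusion of per-emotion scores. The abstract does not specify the fusion rule, so the weighted average, the function names `fuse_emotion_scores` and `predict_emotion`, the `acoustic_weight` parameter, and the emotion labels below are all assumptions made for this sketch, not the method evaluated in the paper.

```python
from typing import Dict


def fuse_emotion_scores(
    acoustic: Dict[str, float],
    linguistic: Dict[str, float],
    acoustic_weight: float = 0.5,
) -> Dict[str, float]:
    """Combine per-emotion scores from two classifiers (hypothetical rule).

    Each input maps an emotion label to a score, e.g. a class probability.
    A label missing from one classifier is treated as having score 0.0.
    """
    if not 0.0 <= acoustic_weight <= 1.0:
        raise ValueError("acoustic_weight must be in [0, 1]")

    labels = set(acoustic) | set(linguistic)
    fused = {
        label: acoustic_weight * acoustic.get(label, 0.0)
        + (1.0 - acoustic_weight) * linguistic.get(label, 0.0)
        for label in labels
    }

    # Renormalize so the fused scores sum to 1.
    total = sum(fused.values())
    if total == 0.0:
        raise ValueError("all fused scores are zero")
    return {label: score / total for label, score in fused.items()}


def predict_emotion(
    acoustic: Dict[str, float],
    linguistic: Dict[str, float],
    acoustic_weight: float = 0.5,
) -> str:
    """Return the label with the highest fused score."""
    fused = fuse_emotion_scores(acoustic, linguistic, acoustic_weight)
    return max(fused, key=fused.get)


# Illustrative scores for a single utterance (made-up values).
speech_scores = {"angry": 0.6, "neutral": 0.4}
text_scores = {"angry": 0.2, "neutral": 0.8}
label = predict_emotion(speech_scores, text_scores, acoustic_weight=0.5)
```

In a conversational agent, the resulting label (or the full fused distribution) would be passed to the dialog manager as an extra input. How to weight the two recognizers, and how to bring in context or recognition errors, is left open in this sketch.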
