Cloud-Based Sentiment Analysis for Interactive Agents

Emotions play an important role in human-agent interaction. To realise natural interaction, an agent must be able to analyse the sentiment in users' utterances. Modern agents use a distributed service model in which their functions can be located on any number of computers, including cloud-based servers. Outsourcing speech recognition and sentiment analysis to a cloud service enables even simple agents to adapt their behaviour to the emotional state of their users. In this study, we test whether sentiment analysis tools can accurately gauge sentiment in human-chatbot interaction. To that end, we compare the quality of sentiment analysis obtained from three major suppliers of cloud-based sentiment analysis services (Microsoft, Amazon and Google). In addition, we compare their results with the leading lexicon-based software and with human ratings. The results show that although the sentiment analysis tools agree moderately with each other, they do not correlate well with human ratings. While cloud-based services would be an extremely useful tool for human-agent interaction, their current quality does not justify their use in human-agent conversations.
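To illustrate the kind of comparison described above, the following is a minimal sketch of how agreement between a tool's sentiment scores and human ratings could be quantified with a rank correlation. The abstract does not state which statistic the study used, so Spearman's rho is an assumption here, and the function names and the scores in the usage example are hypothetical placeholders, not data from the study.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical per-utterance scores in [-1, 1]; NOT results from the study.
    tool_scores = [0.9, 0.1, -0.4, 0.6, -0.8]
    human_ratings = [0.2, 0.5, -0.1, -0.3, 0.4]
    print(f"Spearman rho (tool vs. human): {spearman(tool_scores, human_ratings):.2f}")
```

Working on ranks rather than raw scores avoids assuming that different services report sentiment on the same scale; the same function could be applied to each pair of tools to gauge how well they agree with each other.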
