Twitter: A New Online Source of Automatically Tagged Data for Conversational Speech Emotion Recognition

In affect detection for multimedia, there is strong demand for more tagged data to better understand human emotions, how they are expressed, and how to detect them automatically. Unfortunately, emotion datasets are typically small because annotating them with emotional labels is a manual process. In response, we present the first application of automatically tagged Twitter data to speech emotion recognition (SER). SER has been shown to benefit from combining acoustic and linguistic features, albeit typically when the linguistic training data come from the same database as the test data. Using the presence of emoticons for automatic tagging, we compile a corpus of over 800,000 tweets that is entirely independent of our evaluation database. By supplementing an acoustic classifier with linguistic information, we classify the spontaneous content of the USC-IEMOCAP corpus along valence and activation descriptors. In comparison with prior literature, we demonstrate valence classification improvements of 2% and 6% over an acoustic-only system, using linguistic training data from Twitter and from IEMOCAP respectively.
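To make the tagging idea concrete, the following is a minimal sketch of emoticon-based weak labeling of tweets, in the spirit of the approach described above. The emoticon lists, the binary positive/negative label set, and the rule of discarding ambiguous tweets are illustrative assumptions, not the paper's exact procedure; stripping the emoticons from the retained text prevents a downstream linguistic classifier from trivially keying on the labels themselves.

```python
# Hypothetical emoticon inventories for weak valence labels; the paper's
# exact emoticon sets and filtering rules are not specified here.
POSITIVE = {":)", ":-)", ":D", ":-D", "=)", ";)"}
NEGATIVE = {":(", ":-(", ":'(", "=(", "D:"}

def tag_tweet(text: str):
    """Assign a weak valence label from emoticon presence.

    Returns (label, cleaned_text), or None if the tweet contains no
    emoticon or contains conflicting emoticons.
    """
    tokens = text.split()
    has_pos = any(tok in POSITIVE for tok in tokens)
    has_neg = any(tok in NEGATIVE for tok in tokens)
    if has_pos == has_neg:  # neither or both -> ambiguous, discard
        return None
    label = "positive" if has_pos else "negative"
    # Remove the emoticons so the classifier must learn from the words.
    cleaned = " ".join(
        tok for tok in tokens if tok not in POSITIVE and tok not in NEGATIVE
    )
    return label, cleaned

print(tag_tweet("great show tonight :)"))      # ('positive', 'great show tonight')
print(tag_tweet("stuck in traffic again :("))  # ('negative', 'stuck in traffic again')
print(tag_tweet("no emoticon here"))           # None (discarded)
```

Applied at scale, a filter of this kind over a tweet stream is one plausible way to assemble a large weakly labeled corpus such as the 800,000-tweet collection described above.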
