YouTube Movie Reviews: In, Cross, and Open-domain Sentiment Analysis in an Audiovisual Context

In this contribution, we focus on automatically analyzing a speaker’s sentiment in on-line videos containing movie reviews. In addition to textual information, we consider audio features of the kind typically used in speech-based emotion recognition, as well as video features encoding valence information conveyed by the speaker. We combine this multi-modal experimental setup with a detailed analysis of different methods for linguistic sentiment analysis, gradually increasing the level of domain independence. First, we consider in-domain analysis using a cross-validation setup applied to a novel database, the Multi-Modal Movie Opinion (ICT-MMMO) corpus. Next, we concentrate on cross-domain analysis by using a large corpus of written movie reviews for training. Finally, we explore the application of on-line knowledge sources for inferring the speaker’s sentiment. Our experimental results indicate that training on written movie reviews is a promising alternative to exclusively using (spoken) in-domain data for building a system that analyzes spoken movie review videos, and that language-independent audiovisual analysis can compete with linguistic
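To make the distinction between in-domain and cross-domain evaluation concrete, the following is a minimal sketch and not the system described in the abstract. It assumes a bag-of-n-grams representation with a small multinomial Naive Bayes classifier as a stand-in for the linguistic analysis; the classifier choice, all function names, and the toy sentences are hypothetical illustrations and are not taken from the ICT-MMMO corpus or the written review corpus.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return lowercased word n-grams of order 1..n_max."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on bag-of-n-gram counts."""
    counts = {y: Counter() for y in set(labels)}
    priors = Counter(labels)
    for text, y in zip(docs, labels):
        counts[y].update(ngrams(text))
    vocab = set().union(*counts.values())
    return counts, priors, vocab


def predict_nb(model, text):
    """Return the most probable label; n-grams unseen in training are ignored."""
    counts, priors, vocab = model
    total = sum(priors.values())
    best, best_score = None, -math.inf
    for y in sorted(counts):
        denom = sum(counts[y].values()) + len(vocab)  # add-one smoothing
        score = math.log(priors[y] / total)
        for f in ngrams(text):
            if f in vocab:
                score += math.log((counts[y][f] + 1) / denom)
        if score > best_score:
            best, best_score = y, score
    return best


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the reference."""
    return sum(predict_nb(model, d) == y for d, y in zip(docs, labels)) / len(docs)


def leave_one_out(docs, labels):
    """In-domain evaluation: hold out each document once, train on the rest."""
    hits = 0
    for i in range(len(docs)):
        model = train_nb(docs[:i] + docs[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict_nb(model, docs[i]) == labels[i]
    return hits / len(docs)


# Hypothetical toy data: "written" reviews for training, "spoken" transcripts for testing.
written = [
    "a brilliant and moving film",
    "great acting and a great story",
    "a dull and boring plot",
    "terrible pacing and awful dialogue",
]
written_labels = ["pos", "pos", "neg", "neg"]

spoken = [
    "um I thought it was a great film",
    "it was like really boring you know",
    "I mean the story was great",
    "honestly the dialogue was awful",
]
spoken_labels = ["pos", "neg", "pos", "neg"]

# Cross-domain: train on written reviews, test on spoken transcripts.
cross_domain_acc = accuracy(train_nb(written, written_labels), spoken, spoken_labels)

# In-domain: cross-validation within the spoken transcripts only.
in_domain_acc = leave_one_out(spoken, spoken_labels)

print("cross-domain accuracy:", cross_domain_acc)
print("in-domain (leave-one-out) accuracy:", in_domain_acc)
```

On data this small the numbers carry no meaning; the sketch only shows how the two protocols differ in where the training data comes from. In the cross-domain case the spoken transcripts are never seen during training, whereas in the in-domain case each transcript is scored by a model trained on the remaining transcripts.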
