Learning and Knowledge-Based Sentiment Analysis in Movie Review Key Excerpts

We propose a data-driven approach based on back-off N-Grams and Support Vector Machines, which have recently become popular in the fields of sentiment and emotion recognition. In addition, we introduce a novel valence classifier based on linguistic analysis and the on-line knowledge sources ConceptNet, General Inquirer, and WordNet. As a special benefit, this knowledge-based approach does not require labeled training data. Moreover, we show how such knowledge sources can be leveraged to reduce out-of-vocabulary events in learning-based processing. To profit from both of these fundamentally different approaches and their independent knowledge sources, we employ information fusion techniques that combine their strengths, which ultimately leads to better overall performance. Finally, we extend the data-driven classifier to solve a regression problem in order to obtain a more fine-grained resolution of valence.
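
To make the idea of back-off N-Gram features concrete, the following is a minimal sketch of one common interpretation: at each token position, the longest N-Gram seen during training is counted, and shorter N-Grams are used as a fallback when the longer one is unknown. The function names (`ngrams`, `build_vocabulary`, `backoff_features`), the toy data, and the exact back-off rule are illustrative assumptions, not the procedure described in the paper; the resulting counts could serve as input to a classifier such as a Support Vector Machine.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(corpus, max_n):
    """Collect every n-gram of order 1..max_n seen in the training corpus."""
    vocab = set()
    for tokens in corpus:
        for n in range(1, max_n + 1):
            vocab.update(ngrams(tokens, n))
    return vocab


def backoff_features(tokens, vocab, max_n):
    """Count, per start position, the longest n-gram known to the vocabulary.

    Assumption (hypothetical back-off rule): if the n-gram of order max_n
    starting at a position is unknown, back off to order max_n - 1, and so
    on down to unigrams. Positions whose unigram is unknown contribute
    nothing (an out-of-vocabulary event).
    """
    counts = Counter()
    for i in range(len(tokens)):
        longest = min(max_n, len(tokens) - i)
        for n in range(longest, 0, -1):
            gram = tuple(tokens[i:i + n])
            if gram in vocab:
                counts[gram] += 1
                break
    return counts


# Toy training data (invented for illustration only).
train = [["a", "great", "film"], ["not", "a", "great", "film"]]
vocab = build_vocabulary(train, max_n=2)
features = backoff_features(["a", "great", "movie"], vocab, max_n=2)
```

In this toy run, the bigram ("a", "great") is known and counted directly, the unseen bigram ("great", "movie") backs off to the unigram ("great",), and the unknown word "movie" yields no feature. Mapping such unknown words to known ones via a lexical resource is one way knowledge sources could reduce out-of-vocabulary events, although the sketch does not attempt this.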