A holistic lexicon-based approach to opinion mining

One important type of information on the Web is the opinions expressed in user-generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientations (positive, negative, or neutral) of opinions expressed on product features in reviews. This problem has many applications, e.g., opinion mining, summarization, and search. Most existing techniques utilize a list of opinion (bearing) words (also called an opinion lexicon) for this purpose. Opinion words are words that express desirable (e.g., great, amazing) or undesirable (e.g., bad, poor) states. These approaches, however, all have some major shortcomings. In this paper, we propose a holistic lexicon-based approach to solving the problem by exploiting external evidence and the linguistic conventions of natural language expressions. This approach allows the system to handle opinion words that are context dependent, which cause major difficulties for existing algorithms. It also deals with many special words, phrases, and language constructs that affect opinions through their linguistic patterns, and it has an effective function for aggregating multiple conflicting opinion words in a sentence. A system called Opinion Observer, based on the proposed technique, has been implemented. Experimental results using a benchmark product review data set and some additional reviews show that the proposed technique is highly effective and significantly outperforms existing methods.
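
To make the idea of aggregating conflicting opinion words concrete, the following is a minimal sketch only. The abstract does not specify the aggregation function, so the distance-weighted sum used here is an assumption: each opinion word contributes its polarity divided by its token distance from the product feature. The tiny `LEXICON`, the `NEGATIONS` set, and the function name `feature_orientation` are all hypothetical and chosen for illustration; they are not taken from Opinion Observer.

```python
# Hypothetical toy opinion lexicon (+1 desirable, -1 undesirable).
# A real opinion lexicon would be far larger.
LEXICON = {"great": 1, "amazing": 1, "good": 1, "bad": -1, "poor": -1}

# Hypothetical, deliberately small set of negation words.
NEGATIONS = {"not", "no", "never"}


def feature_orientation(tokens, feature):
    """Classify the opinion on `feature` in a tokenized sentence.

    Assumed scoring rule (not confirmed by the abstract): sum each opinion
    word's polarity weighted by 1 / token distance to the feature, so that
    nearby opinion words count more than distant ones.
    """
    if feature not in tokens:
        return "neutral"
    f_idx = tokens.index(feature)

    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok)
        if polarity is None or i == f_idx:
            continue
        # Simplistic negation rule: flip polarity if the previous token negates.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity / abs(i - f_idx)

    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


sentence = "the battery is great but the screen is poor".split()
battery_opinion = feature_orientation(sentence, "battery")
screen_opinion = feature_orientation(sentence, "screen")
```

In this sentence both "great" and "poor" occur, but "great" is closer to "battery" and "poor" is closer to "screen", so the two features receive opposite orientations. Context-dependent opinion words and richer linguistic patterns, which the paper addresses, are outside the scope of this sketch.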
