Unsupervised lexicon induction for clause-level detection of evaluations

This article proposes clause-level evaluation detection, a fine-grained type of opinion mining, and describes an unsupervised lexicon-building method that captures domain-specific knowledge by exploiting the tendency of adjacent clauses to share the same sentiment polarity. The lexical entries to be acquired are called polar atoms: the minimum human-understandable syntactic structures that specify the polarity of a clause. To obtain candidate polar atoms, we use context coherency, the tendency for the same polarity to appear successively in a context. Using the overall density and precision of coherency in the corpus, a statistical estimation selects appropriate polar atoms from among the candidates, without any manual tuning of threshold values. The experimental results show that the precision of polarity assignment with the automatically acquired lexicon was 83 per cent on average, and that the method is robust across corpora from diverse domains and to the size of the initial lexicon.
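
The idea of acquiring polar atoms through context coherency can be illustrated with a minimal sketch. It is not the article's estimator: it assumes each clause has already been reduced to a single atom string, that a small seed lexicon maps atoms to +1 or -1, and that a candidate is accepted when a Wilson lower confidence bound on its majority polarity exceeds 0.5. The function names, the data layout, and the 0.5 cut-off are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def wilson_lower_bound(successes, trials, z=1.96):
    """Lower end of the Wilson score interval for a binomial proportion."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def coherency_precision(documents, lexicon):
    """Fraction of adjacent clause pairs, both with known polarity, that agree."""
    same = total = 0
    for clauses in documents:
        for a, b in zip(clauses, clauses[1:]):
            if a in lexicon and b in lexicon:
                total += 1
                same += lexicon[a] == lexicon[b]
    return same / total if total else None


def induce_polar_atoms(documents, seed, min_lower_bound=0.5):
    """Assign polarities to unknown atoms that sit next to seed atoms.

    Assumption: min_lower_bound is an illustrative fixed cut-off; the article
    instead derives its decision from corpus-wide coherency statistics.
    """
    votes = defaultdict(lambda: [0, 0])  # atom -> [positive votes, negative votes]
    for clauses in documents:
        for a, b in zip(clauses, clauses[1:]):
            # Each known neighbour casts one vote for its own polarity.
            for known, candidate in ((a, b), (b, a)):
                if known in seed and candidate not in seed:
                    votes[candidate][0 if seed[known] > 0 else 1] += 1

    acquired = {}
    for atom, (pos, neg) in votes.items():
        polarity, agree = (1, pos) if pos >= neg else (-1, neg)
        if wilson_lower_bound(agree, pos + neg) > min_lower_bound:
            acquired[atom] = polarity
    return acquired
```

Calling `induce_polar_atoms(documents, seed)` with `documents` as a list of clause lists returns a dictionary of newly acquired atoms and their polarities, while `coherency_precision` gives a rough corpus-level check of how often adjacent clauses with known polarity agree.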
