Investigation in Statistical Language-Independent Approaches for Opinion Detection in English, Chinese and Japanese

In this paper we present a new statistical approach to opinion detection and evaluate it on English, Chinese and Japanese corpora. We compare the proposed method with three baselines: a Naive Bayes classifier, a language model, and an approach based on significant collocations. All of these models are language independent; using the English corpus as an example, we also show how they can be improved with a language-dependent technique. Our method almost always performs better than the considered baselines.
