Theme detection: an exploration of opinion subjectivity

Work in opinion mining and classification often assumes that incoming documents are opinionated. An opinion mining system produces false hits when it attempts to compute polarity values for non-subjective or factual sentences or documents. It therefore becomes imperative to decide whether a given document contains subjective information, and to identify which portions of the document are subjective and which are factual. In this work, a Theme Detection technique has been developed for more generic, domain-independent subjectivity detection; it classifies each sentence with a binary label: opinionated or non-opinionated. The technique examines sentence-level opinion and then accumulates the opinion clues to reach discourse-level subjectivity. The subjectivity detection system has been evaluated on the Multi-Perspective Question Answering (MPQA) corpus and on a Bengali corpus. The evaluation shows precision and recall values of 76.08 and 83.33 for English, and 72.16 and 76.00 for Bengali, respectively.
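The flow described above (label each sentence as opinionated or non-opinionated, accumulate the sentence-level clues into a document-level decision, then score against gold labels with precision and recall) can be illustrated with a minimal Python sketch. This is not the authors' Theme Detection method: the clue lexicon `OPINION_CLUES`, the function names, and the 0.5 accumulation threshold are all hypothetical assumptions chosen only to make the example runnable.

```python
# Hypothetical clue lexicon, for illustration only; the features actually
# used by the Theme Detection technique are not given in the abstract.
OPINION_CLUES = {"good", "bad", "terrible", "excellent", "believe", "think", "should"}


def sentence_clues(sentence):
    """Return the opinion clues (lexicon hits) found in one sentence."""
    tokens = [t.strip(".,;:!?\"'").lower() for t in sentence.split()]
    return [t for t in tokens if t in OPINION_CLUES]


def is_opinionated(sentence):
    """Binary sentence-level label: True if any clue is present."""
    return len(sentence_clues(sentence)) > 0


def document_subjectivity(sentences, threshold=0.5):
    """Accumulate sentence-level labels into a document-level decision.

    The threshold is an assumed parameter: the document counts as subjective
    when at least this fraction of its sentences is opinionated.
    """
    if not sentences:
        return False
    flagged = sum(is_opinionated(s) for s in sentences)
    return flagged / len(sentences) >= threshold


def precision_recall(predicted, gold):
    """Precision and recall for the 'opinionated' class."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


sentences = [
    "I think the policy is terrible.",
    "The meeting was held on Monday.",
    "The results were excellent.",
]
predicted = [is_opinionated(s) for s in sentences]
gold = [True, False, False]  # toy gold labels, invented for this example
print(predicted, document_subjectivity(sentences), precision_recall(predicted, gold))
```

The toy gold labels deliberately treat the third sentence as factual, so the lexicon match on "excellent" counts as a false hit. This mirrors the kind of error the abstract attributes to systems that assume every input is opinionated.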
