Opinion-Polarity Identification in Bengali

In this paper, opinion polarity classification of news text is carried out for Bengali, a less privileged language, using a Support Vector Machine (SVM). The present system identifies the semantic orientation of an opinionated phrase as either positive or negative. Classifying text as subjective or objective is a necessary precursor to determining the opinion orientation of evaluative text, since objective text is not evaluative by definition. A subjectivity classifier is therefore used to perform sentence-level subjectivity classification. The present system takes a hybrid approach to the overall opinion polarity identification problem and works with lexicon entities and linguistic syntactic features. The baseline system works only with SentiWordNet (Bengali). Lexical features such as negative words, stemming clusters, functional words and parts of speech improve the performance of the present system over the baseline. Inclusion of the chunk feature improves the precision of the system by 19.2%. A further improvement of 3.6% in precision is obtained with the use of dependency relation information. Evaluation of the final system shows a precision of 70.04% and a recall of 63.02%.

Keywords: Opinion Mining, Polarity Identification, Bengali, Phrase Level Polarity Identification.
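The reported precision and recall can be combined into a single balanced score. The sketch below computes the standard F-measure from the two figures given above. The F1 value it prints is our own arithmetic on those figures, not a result stated by the authors, and the function name `f_measure` is chosen here purely for illustration.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    beta = 1.0 gives the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures reported for the final system, in percent.
precision = 70.04
recall = 63.02

# Derived value (assumption: balanced weighting, beta = 1).
f1 = f_measure(precision, recall)
print(f"F1 = {f1:.2f}%")
```

Because the harmonic mean is pulled toward the smaller of the two values, the resulting score sits closer to the recall than a simple average would.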
