SentiWordNet for Indian Languages

Sentiment analysis is the discipline concerned with identifying and classifying sentiment, opinion, and emotion in human-written text. A typical computational approach to sentiment analysis starts with a prior polarity lexicon, in which each entry is tagged with its prior, out-of-context polarity as human beings perceive it using their cognitive knowledge. To date, most research efforts in the sentiment lexicon literature deal with English text. In this article, we propose several computational techniques (WordNet-based, dictionary-based, corpus-based, and generative approaches) for generating SentiWordNet(s) for Indian languages. Currently, SentiWordNet(s) are being developed for three Indian languages: Bengali, Hindi, and Telugu. An intuitive online game has been developed to create and validate the SentiWordNet(s) by involving Internet users. A number of automatic, semi-automatic, and manual validation and evaluation methodologies have been adopted to measure the coverage and credibility of the developed SentiWordNet(s).
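To make the idea of a dictionary-based approach concrete, the following is a minimal sketch of how prior polarity scores from an English lexicon could be projected onto target-language words through a bilingual dictionary. It is not the authors' implementation: the lexicon entries, the scores, the transliterated Bengali words, and the function names `project_lexicon` and `prior_polarity` are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical English prior-polarity lexicon: word -> (positive, negative) score.
# The scores are illustrative and not taken from any real resource.
ENGLISH_LEXICON = {
    "good": (0.75, 0.0),
    "happy": (0.875, 0.0),
    "bad": (0.0, 0.625),
    "sad": (0.0, 0.75),
}

# Hypothetical bilingual dictionary: English word -> target-language words
# (transliterated Bengali, used here only as placeholders).
BILINGUAL_DICT = {
    "good": ["bhalo"],
    "happy": ["khushi", "sukhi"],
    "bad": ["kharap"],
    "sad": ["dukhi"],
}


def project_lexicon(source_lexicon, bilingual_dict):
    """Copy each source word's scores to its translations.

    If several source words map to the same target word, their
    scores are averaged.
    """
    collected = defaultdict(list)
    for word, scores in source_lexicon.items():
        for translation in bilingual_dict.get(word, []):
            collected[translation].append(scores)

    projected = {}
    for target, score_list in collected.items():
        count = len(score_list)
        positive = sum(pos for pos, _ in score_list) / count
        negative = sum(neg for _, neg in score_list) / count
        projected[target] = (positive, negative)
    return projected


def prior_polarity(word, lexicon):
    """Return the out-of-context polarity label for a word.

    Words missing from the lexicon are treated as neutral.
    """
    positive, negative = lexicon.get(word, (0.0, 0.0))
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


target_lexicon = project_lexicon(ENGLISH_LEXICON, BILINGUAL_DICT)
```

A lexicon built this way only covers words that appear in the bilingual dictionary, which is one reason to combine it with other techniques and with human validation.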
