A Supervised Method for Constructing Sentiment Lexicon in Persian Language

With the rapid growth of digital content on the internet and social media, sentiment analysis has become one of the emerging research fields. The problem, which concerns information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Constructing a sentiment lexicon, a valuable language resource, is one of the important areas of study in this domain. Most research on sentiment analysis has focused on English, and few works have addressed Persian because of the lack of resources. This paper introduces a supervised method for creating a sentiment dictionary for Persian that extracts linguistic features from reviews and uses statistical mutual information to determine the sentiment orientation and sentiment strength of words. To evaluate the proposed method, a set of existing reviews from an online retail site, covering various domains, is used, and the resulting dictionary is compared with SentiWordNet. The results show that the proposed method achieves an accuracy of 80% in determining the orientation of sentiment words.
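
As an illustration of how mutual information can yield a sentiment orientation from labeled reviews, the following is a minimal sketch and not the paper's actual procedure. It assumes each review is already tokenized and labeled "pos" or "neg", scores a word as PMI(word, pos) minus PMI(word, neg) using document frequencies, and applies add-one smoothing. The function name `sentiment_orientation`, the `smoothing` parameter, and the toy English reviews are all hypothetical; the linguistic feature extraction described above is not modeled.

```python
import math
from collections import Counter


def sentiment_orientation(reviews, smoothing=1.0):
    """Score each word as PMI(word, pos) - PMI(word, neg).

    reviews: iterable of (tokens, label) pairs, label in {"pos", "neg"}.
    The difference of the two PMI values reduces to
    log2(P(word | pos) / P(word | neg)), estimated here from
    smoothed document frequencies (an assumption of this sketch).
    """
    doc_freq = {"pos": Counter(), "neg": Counter()}
    n_docs = Counter()
    for tokens, label in reviews:
        n_docs[label] += 1
        # Count each word once per review (document frequency).
        doc_freq[label].update(set(tokens))

    vocab = set(doc_freq["pos"]) | set(doc_freq["neg"])
    scores = {}
    for word in vocab:
        pos_rate = (doc_freq["pos"][word] + smoothing) / (n_docs["pos"] + 2 * smoothing)
        neg_rate = (doc_freq["neg"][word] + smoothing) / (n_docs["neg"] + 2 * smoothing)
        scores[word] = math.log2(pos_rate / neg_rate)
    return scores


# Toy labeled reviews (hypothetical data, for illustration only).
toy_reviews = [
    (["good", "phone"], "pos"),
    (["good", "screen"], "pos"),
    (["bad", "phone"], "neg"),
    (["bad", "battery"], "neg"),
]
toy_scores = sentiment_orientation(toy_reviews)
for word in sorted(toy_scores):
    print(word, round(toy_scores[word], 3))
```

In this sketch the sign of a score gives the orientation, and its magnitude could serve as a rough strength estimate; a word that occurs equally often in both classes, such as "phone" in the toy data, scores zero.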
