Accurate frequency-based lexicon generation for opinion mining

Sentiment analysis classifies the opinions expressed in text. Twitter is the most popular microblogging platform in social media, with hundreds of millions of tweets posted every day, and a considerable number of those tweets contain opinions. The goal of this paper is to classify the polarity of tweets as positive or negative using dynamic sentiment lexicons built from the frequencies of words in the positive and negative classes. We extract five meta-level features that incorporate the generated sentiment lexicons and classify the text based on them. We also incorporate some previously known lexicon-based and corpus-based features. Because the lexicons are generated dynamically, they can capture changes in the meanings of words. The proposed method is assessed on six datasets; it outperforms previous work in accuracy on four datasets and in F-measure on three datasets. Accuracy exceeds 85% on four datasets, and F-measure exceeds 85% on three.
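The idea of a frequency-based lexicon can be illustrated with a minimal sketch. It is not the paper's exact formulation: the scoring formula (normalized difference of relative class frequencies), the `min_count` threshold, the whitespace tokenizer, and the names `build_lexicon` and `tweet_score` are all assumptions made for illustration, and the five meta-level features are not reproduced here.

```python
from collections import Counter


def build_lexicon(labeled_tweets, min_count=2):
    """Score each word in [-1, 1] from its relative frequency in each class.

    labeled_tweets: iterable of (text, label) pairs, label "pos" or "neg".
    The scoring formula and min_count threshold are illustrative assumptions.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for text, label in labeled_tweets:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        (pos_counts if label == "pos" else neg_counts).update(tokens)

    # Guard against an empty class so the divisions below are always defined.
    pos_total = sum(pos_counts.values()) or 1
    neg_total = sum(neg_counts.values()) or 1

    lexicon = {}
    for word in set(pos_counts) | set(neg_counts):
        if pos_counts[word] + neg_counts[word] < min_count:
            continue  # skip rare words whose frequencies are unreliable
        p = pos_counts[word] / pos_total  # relative frequency in positive class
        n = neg_counts[word] / neg_total  # relative frequency in negative class
        lexicon[word] = (p - n) / (p + n)
    return lexicon


def tweet_score(text, lexicon):
    """Sum the lexicon scores of the words in a tweet; unknown words count as 0."""
    return sum(lexicon.get(tok, 0.0) for tok in text.lower().split())


# Toy training data (made up for this example).
train = [
    ("great phone love it", "pos"),
    ("love the great camera", "pos"),
    ("bad battery hate it", "neg"),
    ("hate the bad screen", "neg"),
]
lexicon = build_lexicon(train)
```

Because the lexicon is rebuilt from whatever labeled tweets are supplied, rerunning `build_lexicon` on newer data would shift a word's score if its usage changes, which is the sense in which such a lexicon is dynamic. In a full pipeline, scores like `tweet_score` would serve as inputs to a classifier rather than as the final decision.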
