Exerting 2D-Space of Sentiment Lexicons with Machine Learning Techniques: A Hybrid Approach for Sentiment Analysis

Sentiment mining from textual content on the web can give valuable insights for discernment, strategic decision making, targeted advertisement, and much more. Supervised machine learning (ML) approaches do not capture the sentiment inherent in individual terms, whereas unsupervised sentiment lexicon (SL) based approaches lag behind ML approaches because they are biased toward one sentiment over the other. In this paper, we propose a hybrid approach that uses unsupervised sentiment lexicons to transform the term space into a two-dimensional sentiment space, on which a discriminative classifier is trained in a supervised fashion. This hybrid approach yields higher accuracy, faster training, and a lower memory footprint than the ML approaches. It is also better suited to scenarios where training data is scarce. We support our claim by reporting results on six social media datasets using five sentiment lexicons and four ML algorithms.
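
The idea of mapping terms into a two-dimensional sentiment space and then training a discriminative classifier on it can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon `TOY_LEXICON`, the choice of axes (summed positive score and summed negative score per document), the tiny training set, and the use of a hand-written logistic regression as the classifier. The abstract does not specify how the two dimensions are defined or which classifier settings were used, so this is not the paper's implementation.

```python
import math

# Hypothetical toy lexicon (term -> polarity score); not taken from any real lexicon.
TOY_LEXICON = {
    "good": 2.0, "great": 3.0, "love": 3.0, "nice": 1.0,
    "bad": -2.0, "awful": -3.0, "hate": -3.0, "boring": -1.0,
}


def to_sentiment_space(text, lexicon=TOY_LEXICON):
    """Map a document to a 2D point: (positive mass, negative mass)."""
    pos, neg = 0.0, 0.0
    for token in text.lower().split():
        score = lexicon.get(token, 0.0)  # terms missing from the lexicon count as 0
        if score > 0:
            pos += score
        elif score < 0:
            neg += -score
    return (pos, neg)


def train_logistic(points, labels, lr=0.1, epochs=500):
    """Fit a two-feature logistic regression with batch gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            p = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
            err = p - y
            grad_w[0] += err * x1
            grad_w[1] += err * x2
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def predict(text, w, b):
    """Return 1 (positive) or 0 (negative) for a raw text."""
    x1, x2 = to_sentiment_space(text)
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


# Tiny illustrative training set (made up): 1 = positive, 0 = negative.
TRAIN = [
    ("great movie love it", 1),
    ("nice and good", 1),
    ("bad start but great ending love it", 1),
    ("awful plot hate it", 0),
    ("bad and boring", 0),
    ("good acting but awful boring script", 0),
]

points = [to_sentiment_space(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
w, b = train_logistic(points, labels)
```

In this sketch the classifier learns only three parameters regardless of vocabulary size, which is one plausible reading of why a two-dimensional representation would train faster and use less memory than a full term-space model.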
