Sentiment analysis: Bayesian Ensemble Learning

The amount of textual data on the Web has grown rapidly in the last few years, creating unique content of massive dimension. In a decision-making context, one of the most relevant tasks is the polarity classification of a text source, which is usually performed through supervised learning methods. Most existing approaches select the single best classification model, which leads to over-confident decisions that do not take into account the inherent uncertainty of natural language. In this paper, we pursue the paradigm of ensemble learning to reduce the noise sensitivity related to language ambiguity and thereby provide a more accurate prediction of polarity. The proposed ensemble method is based on Bayesian Model Averaging, where both the uncertainty and the reliability of each single model are taken into account. We address the classifier selection problem by proposing a greedy approach that evaluates the contribution of each model with respect to the ensemble. Experimental results on gold standard datasets show that the proposed approach outperforms both traditional classification and ensemble methods.

Highlights: A novel ensemble learning methodology is proposed for the polarity classification task. A selection strategy is studied to reduce the search space of candidate ensembles. The proposed model has been shown to be effective and efficient in several domains.
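To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each model outputs a class probability distribution per text, that a model's reliability is a single non-negative weight (for example, a score estimated on held-out data), and that the greedy search is a backward elimination that drops one model at a time while validation accuracy strictly improves. The function names, the weights, and the toy probabilities are all hypothetical.

```python
from typing import Dict, List, Tuple

# probs[n][c]: probability that example n belongs to class c, for one model
Probs = List[List[float]]


def bma_combine(model_probs: Dict[str, Probs], weights: Dict[str, float]) -> Probs:
    """Average the per-model class distributions, weighted by model reliability."""
    names = list(model_probs)
    total = sum(weights[m] for m in names)
    n_examples = len(model_probs[names[0]])
    n_classes = len(model_probs[names[0]][0])
    return [
        [sum(weights[m] * model_probs[m][n][c] for m in names) / total
         for c in range(n_classes)]
        for n in range(n_examples)
    ]


def accuracy(probs: Probs, labels: List[int]) -> float:
    """Fraction of examples whose most probable class matches the label."""
    preds = [max(range(len(row)), key=row.__getitem__) for row in probs]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def greedy_backward_selection(
    model_probs: Dict[str, Probs], weights: Dict[str, float], labels: List[int]
) -> Tuple[List[str], float]:
    """Assumed strategy: repeatedly drop the model whose removal helps most."""
    selected = dict(model_probs)
    best = accuracy(bma_combine(selected, weights), labels)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        to_drop = None
        for name in list(selected):
            candidate = {m: p for m, p in selected.items() if m != name}
            score = accuracy(bma_combine(candidate, weights), labels)
            if score > best:  # keep only strict improvements
                best, to_drop, improved = score, name, True
        if improved:
            del selected[to_drop]
    return sorted(selected), best


# Toy validation set: 4 texts, classes 0 = negative, 1 = positive (made-up numbers).
labels = [1, 0, 1, 0]
model_probs = {
    "nb": [[0.2, 0.8], [0.7, 0.3], [0.4, 0.6], [0.6, 0.4]],
    "svm": [[0.3, 0.7], [0.6, 0.4], [0.55, 0.45], [0.8, 0.2]],
    "noisy": [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1], [0.2, 0.8]],
}
weights = {"nb": 1.0, "svm": 0.75, "noisy": 0.5}  # hypothetical reliabilities

full_accuracy = accuracy(bma_combine(model_probs, weights), labels)
selected, best = greedy_backward_selection(model_probs, weights, labels)
print(full_accuracy, selected, best)
```

On this toy data the full three-model ensemble misclassifies one text, and removing the unreliable model is the only removal that improves validation accuracy, so the search stops with the remaining two. Searching greedily avoids scoring every possible subset of models, which is the search-space reduction the abstract refers to.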
