A cost-sensitive three-way combination technique for ensemble learning in sentiment classification

Abstract Deep neural networks (DNNs) have achieved remarkable results in sentiment classification, and several ensemble methods that combine DNN models with traditional feature-based models have recently been proposed. However, to the best of our knowledge, most of these works use traditional ensemble combination techniques, e.g., voting and stacking, which were designed for weak base classifiers. Many base classifiers, e.g., DNNs, now achieve good results in sentiment classification on their own, which calls for an ensemble combination technique designed for strong base classifiers. To address this issue, we propose a cost-sensitive combination technique based on sequential three-way decisions (3WD), named S3WC. In S3WC, the base classifiers are arranged in a linear sequence, and a gate mechanism at each step divides the objects into three groups: the positive region, the negative region, and the boundary region, which correspond to acceptance, rejection, and deferment in sequential 3WD, respectively. Each object is assigned to the group that minimizes its total cost, which consists of misclassification cost and time cost. Objects in the boundary region need more information to reduce their misclassification cost, so they are reclassified by the subsequent base classifiers, at the price of a higher time cost. In the experiments, we apply S3WC to DNN models and traditional feature-based models on five benchmark datasets and compare its performance with traditional ensemble combination techniques. The experimental results show that S3WC outperforms each of its base classifiers in classification accuracy, and that its total cost is lower than that of existing ensemble combination techniques (e.g., majority voting, weighted voting, and meta-learning).
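The sequential gating idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact cost functions, so the sketch assumes that each base classifier returns a probability of the positive class, that accepting or rejecting costs the expected misclassification cost, and that deferring costs the time cost of the next classifier. The names `gate`, `s3wc_predict`, `cost_fp`, `cost_fn`, and `time_costs` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: a base classifier maps a text to P(positive | text).
Classifier = Callable[[str], float]


def gate(p_pos: float, cost_fp: float, cost_fn: float, cost_defer: float) -> str:
    """Choose the action with the lowest expected cost for one object.

    Assumption: deferring costs only the time of the next classifier.
    """
    expected = {
        "accept": (1.0 - p_pos) * cost_fp,  # expected cost of a false positive
        "reject": p_pos * cost_fn,          # expected cost of a false negative
        "defer": cost_defer,                # price of asking the next classifier
    }
    return min(expected, key=expected.get)


def s3wc_predict(
    text: str,
    classifiers: Sequence[Classifier],
    time_costs: Sequence[float],
    cost_fp: float = 1.0,
    cost_fn: float = 1.0,
) -> Tuple[int, float, int]:
    """Run the classifiers in order until one accepts or rejects.

    Returns (label, time cost spent, number of classifiers used),
    where label is 1 for positive and 0 for negative.
    """
    spent = 0.0
    for step, clf in enumerate(classifiers):
        spent += time_costs[step]
        p_pos = clf(text)
        if step == len(classifiers) - 1:
            # Last classifier: nothing left to defer to, so decide two-way.
            accept = (1.0 - p_pos) * cost_fp <= p_pos * cost_fn
            action = "accept" if accept else "reject"
        else:
            action = gate(p_pos, cost_fp, cost_fn, time_costs[step + 1])
        if action != "defer":
            return (1 if action == "accept" else 0), spent, step + 1
    raise ValueError("classifiers must not be empty")
```

In this sketch a confident early prediction stops the sequence immediately, while an uncertain one falls into the boundary region and is passed on, so the extra time cost is paid only for objects that need more information.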
