A Novel Ensemble Approach for Feature Selection to Improve and Simplify the Sentimental Analysis

Text classification is a well-established machine learning approach for simplifying domain-specific investigation, and it is frequently used in sentimental analysis. Demanding business requirements call for new techniques and approaches to improve the performance of sentimental analysis. In this context, an ensemble of classifiers is one of the promising approaches for improving classification accuracy. However, ensembles are usually applied at the classification stage, while the significance of feature selection is ignored. With the right feature selection methodology, classification accuracy can be significantly improved even when classification is performed by a single classifier. This article presents a novel feature selection ensemble approach for sentimental classification. First, a combination of three well-known features (i.e., lexicon, phrases and unigrams) is introduced. Second, a two-level ensemble is proposed for feature selection by exploiting Gini Index (GI), Information Gain (IG), Support Vector Machine (SVM) and Logistic Regression (LR). Subsequently, classification is performed by an SVM classifier. The proposed approach is implemented in the GATE and RapidMiner tools. Two benchmark datasets frequently used in sentimental classification are employed for experimental evaluation. The experimental results show that the proposed ensemble approach significantly improves the performance of sentimental classification compared with well-known state-of-the-art approaches. The results also indicate that an ensemble of classifiers is not necessarily required to improve classification accuracy when the right feature selection methodology is in place.
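
As a rough illustration of ensemble feature selection with two of the named filters, the sketch below scores each term by Information Gain and by a Gini-based score, then keeps the terms that both rankings place in their top k. This is a minimal sketch under stated assumptions, not the article's method: the abstract does not say how the two levels are combined, so the intersection rule, the use of impurity reduction as the Gini score, and all function names (`impurity_reduction`, `top_k`, `ensemble_select`) are hypothetical. The SVM and LR stages are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def impurity_reduction(docs, labels, term, impurity):
    """Drop in impurity when documents are split on presence of `term`.

    With `entropy` this is Information Gain; with `gini` it is one common
    Gini-based score (an assumption; the article may define GI differently).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    weighted = sum(len(part) / n * impurity(part)
                   for part in (with_term, without_term) if part)
    return impurity(labels) - weighted


def top_k(docs, labels, vocab, impurity, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(vocab, key=lambda t: (-impurity_reduction(docs, labels, t, impurity), t))
    return set(ranked[:k])


def ensemble_select(docs, labels, k):
    """Assumed combination rule: keep terms ranked in the top k by both filters."""
    vocab = sorted(set().union(*docs))
    return top_k(docs, labels, vocab, entropy, k) & top_k(docs, labels, vocab, gini, k)


# Toy corpus (invented for illustration): each document is a set of unigrams.
docs = [{"good", "movie"}, {"great", "movie"}, {"bad", "movie"},
        {"awful", "movie"}, {"good", "plot"}, {"bad", "plot"}]
labels = ["pos", "pos", "neg", "neg", "pos", "neg"]

selected = ensemble_select(docs, labels, k=2)
print(sorted(selected))  # ['bad', 'good']
```

Terms such as "movie" and "plot" appear equally often in both classes, so both filters score them at zero and they are dropped; the surviving features would then be passed to a single classifier.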
