Ensemble Learning Sentiment Classification for Un-labeled Arabic Text

Sentiment classification has become one of the most active research topics, driven by the rapid growth of social media platforms and applications. It is the process of determining the opinion or feeling expressed in a piece of text and assigning it a label (positive, negative, or neutral). One difficulty in sentiment classification is the need for labeled data to train the classifiers; labeling is often carried out manually, which is time-consuming. In this paper we consider lexicon-based classification as a labeling technique in place of manual labeling. In addition, we investigate the use of multiple ensemble learning methods, in which multiple classifiers are combined, to improve classification performance. Experiments were run on datasets of reviews written in Modern Standard Arabic. The results show that the labeling technique is effective and promising, and that ensemble learning clearly improves sentiment classification accuracy compared with traditional methods.
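The two ideas in the abstract, lexicon-based labeling and combining several classifiers, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the tiny `LEXICON`, the whitespace tokenization, the zero-score-means-neutral rule, and the plain majority vote are all assumptions made for illustration, and the abstract does not say which lexicon or which ensemble methods were used.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> polarity); a real system would load
# a full Arabic sentiment lexicon instead.
LEXICON = {"جيد": 1, "ممتاز": 1, "رائع": 1, "سيء": -1, "رديء": -1, "ممل": -1}


def lexicon_label(text):
    """Label a text by summing the polarities of its lexicon words.

    Assumption: naive whitespace tokenization with no stemming or
    normalization, which real Arabic text would need.
    """
    score = sum(LEXICON.get(token, 0) for token in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one text.

    Ties are resolved in favor of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]


# Step 1: label unlabeled reviews automatically instead of by hand.
reviews = ["الفيلم رائع", "الطعام سيء", "ذهبت إلى السوق"]
labels = [lexicon_label(review) for review in reviews]

# Step 2: combine the outputs of several base classifiers (stand-in values
# here; in practice these would come from models trained on the labels above).
combined = majority_vote(["positive", "negative", "positive"])
```

Majority voting is only one way to combine classifiers; bagging and boosting are other common ensemble schemes that would slot into the same second step.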
