Sentiment Classification on Movie Reviews and Twitter: An Experimental Study of Supervised Learning Models

Sentiment classification applies natural language processing and text mining techniques to identify subjective information in text. The wide availability of online data that has accompanied the growth of social media has drawn considerable research interest to sentiment analysis and its applications. In this paper, we review the state of the art to determine how previous research has addressed this task. We also present an empirical study on two annotated datasets, 25,000 IMDB movie reviews and 25,000 tweets, to which we applied nine supervised learning models. We then implemented a voting ensemble classifier from the four best-performing of these models. Finally, we report a benchmark evaluation; the results show that the ensemble classifier outperforms all of the individual machine learning models.
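The selection-and-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the model names, validation accuracies, and predictions below are made-up placeholders, and the abstract specifies neither the voting scheme nor the tie-breaking rule. The sketch assumes hard (majority) voting, with ties resolved in favour of the highest-ranked model.

```python
from collections import Counter


def top_k_models(val_accuracy, k=4):
    """Return the names of the k models with the highest validation accuracy."""
    ranked = sorted(val_accuracy, key=val_accuracy.get, reverse=True)
    return ranked[:k]


def majority_vote(predictions):
    """Hard voting over a dict mapping model name -> list of predicted labels.

    All prediction lists must have the same length. On a tie, the label seen
    first wins, i.e. the label of the earliest model in the dict (assumption).
    """
    columns = zip(*predictions.values())
    return [Counter(column).most_common(1)[0][0] for column in columns]


# Hypothetical validation accuracies for six placeholder models (toy numbers).
val_accuracy = {
    "model_a": 0.81, "model_b": 0.86, "model_c": 0.79,
    "model_d": 0.88, "model_e": 0.84, "model_f": 0.75,
}

# Hypothetical binary predictions (1 = positive, 0 = negative) on five samples.
test_predictions = {
    "model_a": [1, 0, 1, 1, 0],
    "model_b": [1, 0, 0, 1, 0],
    "model_c": [0, 1, 1, 0, 1],
    "model_d": [1, 1, 0, 1, 0],
    "model_e": [0, 0, 0, 1, 1],
    "model_f": [0, 1, 1, 0, 1],
}

top = top_k_models(val_accuracy, k=4)
# Keep the ranking order so that ties favour the best-ranked model.
selected = {name: test_predictions[name] for name in top}
ensemble = majority_vote(selected)
print(top)
print(ensemble)
```

With an even number of voters, as in the four-model ensemble, ties are possible on binary labels, so a tie-breaking rule has to be chosen; weighting votes by validation accuracy or averaging predicted probabilities (soft voting) are common alternatives.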
