Analyzing the Performance of SVM for Polarity Detection with Different Datasets

Social media and micro-blogging websites have become popular platforms where anyone can express their thoughts about a news item, event, or product. Analyzing this massive amount of user-generated data is an active research problem. Sentiment analysis includes the classification of a text as positive, negative, or neutral, a task known as polarity detection. Support Vector Machine (SVM) is one of the most widely used machine learning algorithms for sentiment analysis. In this research, we propose a sentiment analysis framework and use it to analyze the performance of SVM for textual polarity detection. We use three datasets in our experiments: two from Twitter and one of IMDB reviews. To evaluate SVM, we use three ratios of training data to test data: 70:30, 50:50, and 30:70. Performance is measured in terms of precision, recall, and F-measure for each dataset.
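The evaluation protocol can be illustrated with a minimal Python sketch. It is not the authors' framework: the function names `split_by_ratio` and `precision_recall_f` are hypothetical, the SVM training step is omitted, and the metrics are computed for a single target class because the abstract does not say how scores are averaged across classes.

```python
import random


def split_by_ratio(samples, train_share, test_share, seed=0):
    """Shuffle samples and split them by a train:test ratio such as 70:30.

    Hypothetical helper; the seed is an assumption made for repeatability.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(items) * train_share // (train_share + test_share)
    return items[:cut], items[cut:]


def precision_recall_f(y_true, y_pred, target):
    """Return precision, recall, and F-measure (F1) for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Toy labelled texts standing in for tweets or reviews (made-up data).
    corpus = [("text %d" % i, "pos" if i % 2 else "neg") for i in range(10)]
    for train_share, test_share in [(70, 30), (50, 50), (30, 70)]:
        train, test = split_by_ratio(corpus, train_share, test_share)
        print(train_share, test_share, len(train), len(test))
```

In a full experiment, a classifier would be trained on `train`, its predictions on `test` would be passed to `precision_recall_f`, and the procedure would be repeated for each dataset and each ratio.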
