Sentiment Analysis of Review Datasets Using Naive Bayes and K-NN Classifier

The advent of Web 2.0 has increased the amount of sentiment-bearing content available on the Web. Such content is often found on social media sites in the form of movie or product reviews, user comments, testimonials, and messages in discussion forums. Timely discovery of opinionated web content has a number of advantages, the most important being monetization. Understanding public sentiment towards different entities and products enables better contextual advertising, recommendation systems, and analysis of market trends. The focus of our project is a sentiment-focused web crawling framework that facilitates the quick discovery and analysis of sentiment in movie reviews and hotel reviews. We use statistical methods to capture elements of subjective style and sentence polarity. The paper discusses two supervised machine learning algorithms in detail, K-Nearest Neighbour (K-NN) and Naive Bayes, and compares their overall accuracy, precision, and recall. For movie reviews, Naive Bayes gave far better results than K-NN; for hotel reviews, both algorithms gave lower and nearly identical accuracies.
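The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature representation, the similarity measure, or the value of k, so the bag-of-words counts, Laplace smoothing, cosine similarity, and k = 3 used here are assumptions, and the handful of reviews are invented toy examples rather than data from the paper.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (an assumed, deliberately simple choice)."""
    return text.lower().split()

def train_nb(docs, labels):
    """Collect class document counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Multinomial Naive Bayes with Laplace (add-one) smoothing, in log space."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for c in sorted(class_counts):
        total_words = sum(word_counts[c].values())
        score = math.log(class_counts[c] / total_docs)
        for w in tokenize(doc):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_knn(docs, labels, doc, k=3):
    """Majority vote among the k training reviews most similar to the query."""
    query = Counter(tokenize(doc))
    scored = [(cosine(query, Counter(tokenize(d))), l) for d, l in zip(docs, labels)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

def evaluate(y_true, y_pred, positive="pos"):
    """Return (accuracy, precision, recall) with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Invented toy reviews, for illustration only (not from the paper's datasets).
train_docs = [
    "great acting and a wonderful story",
    "a wonderful film with great music",
    "loved the brilliant cast",
    "terrible plot and awful acting",
    "boring film with an awful script",
    "hated the dull story",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
test_docs = ["wonderful story and great cast", "awful boring plot"]
test_labels = ["pos", "neg"]

nb_model = train_nb(train_docs, train_labels)
nb_pred = [predict_nb(nb_model, d) for d in test_docs]
knn_pred = [predict_knn(train_docs, train_labels, d, k=3) for d in test_docs]

print("Naive Bayes (accuracy, precision, recall):", evaluate(test_labels, nb_pred))
print("K-NN        (accuracy, precision, recall):", evaluate(test_labels, knn_pred))
```

On a toy set this small the two classifiers are not meaningfully distinguishable; differences such as those reported for movie and hotel reviews would only show up on full review corpora with a held-out test split.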
