Most Persistent Feature Selection Method for Opinion Mining of Social Media Reviews

Many business organizations use social media data to understand their customers at an individual level. Consumers are keen to share their views on products and commodities, which generates a large amount of unstructured social media data. As a result, text data is accumulating steadily in many areas, such as automated business, education, health care, and show business. Opinion mining, a subfield of text mining, deals with mining review text and classifying the opinions or sentiments expressed in it as positive or negative. This paper develops a framework for opinion mining that includes a novel feature selection method called Most Persistent Feature Selection (MPFS). The MPFS method uses the information gain of the features in the review documents. The performance of three classifiers, namely Naive Bayes, Maximum Entropy, and Support Vector Machine, combined with the proposed feature selection method is evaluated on movie reviews in terms of accuracy, precision, recall, and F-score. The resulting classifier models show acceptable performance in comparison with existing models.
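The abstract states only that MPFS uses the information gain of features; it does not describe how persistence is measured. As a rough illustration of the underlying quantity, the sketch below computes the standard information gain of a term's presence or absence with respect to the sentiment label. The function names, the toy reviews, and the set-of-words document representation are assumptions made for this example and are not taken from the paper; this is not the MPFS procedure itself.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs in a document.

    `docs` is a list of sets of tokens (an assumed representation),
    `labels` is the parallel list of sentiment labels.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy reviews, each reduced to a set of tokens.
docs = [
    {"great", "plot"},
    {"great", "acting"},
    {"boring", "plot"},
    {"boring", "acting"},
]
labels = ["pos", "pos", "neg", "neg"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)

for term in ranked:
    print(term, round(information_gain(docs, labels, term), 3))
```

In this toy data, "great" and "boring" separate the two classes perfectly, while "plot" and "acting" appear equally often in both and carry no information about the label. A selection method based on information gain would keep the high-scoring terms and discard the rest before training a classifier.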
