Evaluating the Impact of Feature Selection on Overall Performance of Sentiment Analysis

Analyzing the sentiment hidden in user reviews now plays a prominent role in increasing profitability for organizations. The main challenge lies in analyzing text and transforming it into polarity values, with the objective of saving time in understanding public opinion on a particular product or service. Traditionally, different approaches have been used to transform text data into values based on different features of the text. In our research we use Stanford CoreNLP, Alias-i's LingPipe (which uses logistic regression for document classification), SentiWordNet, and libraries synthesized from different sources that cover several other text mining techniques, in order to evaluate the impact of feature selection on overall sentiment analysis by scoring the sentences in a review with different scoring techniques. We also include NTU LIBLINEAR to apply a linear SVM to document classification. The features considered in our experiments are Term Frequency and N-Gram (1-Gram and 2-Gram), with a Decision Tree as the prediction model, evaluated by Accuracy, Area under the ROC Curve, and Kappa value. Finally, we compare the polarities of the reviews obtained using three different sentiment scoring approaches. Our findings are that Term Frequency has a good impact (0.932) on classifying sentiment, whereas 2-Gram has an impact of 0.8505.
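To make the feature definitions and two of the evaluation measures concrete, the following is a minimal Python sketch. It is not the authors' implementation (the abstract names Java toolkits such as Stanford CoreNLP and LingPipe); the function names, the whitespace tokenizer, and the sample review are assumptions chosen purely for illustration. It shows how term-frequency (1-gram) and 2-gram counts can be derived from a review, and how accuracy and Cohen's kappa can be computed from predicted polarity labels.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation.

    A deliberately simple tokenizer (an assumption of this sketch).
    """
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def ngram_counts(tokens, n):
    """Count each contiguous run of n tokens; n=1 is plain term frequency."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def accuracy(actual, predicted):
    """Fraction of reviews whose predicted polarity matches the actual one."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(actual)
    observed = accuracy(actual, predicted)
    actual_freq = Counter(actual)
    predicted_freq = Counter(predicted)
    expected = sum(actual_freq[c] * predicted_freq[c] for c in actual_freq) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when chance agreement is total; this sketch reports 0.0.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical review text and labels, used only to exercise the functions.
review_tokens = tokenize("The battery is good, the screen is not good.")
unigram_features = ngram_counts(review_tokens, 1)
bigram_features = ngram_counts(review_tokens, 2)

actual_labels = ["pos", "pos", "neg", "neg"]
predicted_labels = ["pos", "neg", "neg", "neg"]
print(unigram_features.most_common(3))
print(bigram_features[("not", "good")])
print(accuracy(actual_labels, predicted_labels))
print(cohen_kappa(actual_labels, predicted_labels))
```

In this sample the 2-gram `("not", "good")` keeps the negation attached to the sentiment word, while the 1-gram counts only record that "good" occurs twice. The sketch illustrates how the two feature sets differ; it does not reproduce the reported scores.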
