An Improved Text Sentiment Classification Model Using TF-IDF and Next Word Negation

With the rapid growth of text sentiment analysis, the demand for automatic classification of electronic documents has increased by leaps and bounds. Text classification, or text mining, has been the subject of many research works in recent times. In this paper, we propose a technique for text sentiment classification using term frequency-inverse document frequency (TF-IDF) along with Next Word Negation (NWN). We also compare the performance of the binary bag-of-words model, the TF-IDF model, and the TF-IDF with next word negation (TF-IDF-NWN) model for text classification. We then apply the proposed model with three different text mining algorithms and find that the linear support vector machine (LSVM) is the most appropriate for it. The results show a significant increase in accuracy compared to earlier methods.
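To make the TF-IDF-NWN idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the negation rule, so the sketch assumes that a negation cue is fused with the word that follows it (for example, "not good" becomes the single token "not_good"). The cue list `NEGATION_WORDS`, the function names `next_word_negation` and `tf_idf`, and the unsmoothed weighting tf(t, d) * log(N / df(t)) are all illustrative assumptions.

```python
import math
from collections import Counter

# Assumed negation cue list; the abstract does not specify one.
NEGATION_WORDS = {"not", "no", "never", "cannot"}


def next_word_negation(text):
    """Tokenize on whitespace and fuse each negation cue with the next word."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] in NEGATION_WORDS and i + 1 < len(tokens):
            # Assumed rule: "not good" -> "not_good"; the cue itself is dropped.
            out.append("not_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and an unsmoothed IDF, log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical two-review corpus, for illustration only.
reviews = ["the movie was good", "the movie was not good"]
tokenized = [next_word_negation(r) for r in reviews]
vectors = tf_idf(tokenized)
```

Under this assumed rule, the two reviews no longer share the token "good": the second review carries "not_good" instead, so a downstream classifier such as a linear SVM can assign the two tokens different weights. Terms that appear in every document, such as "the", receive a weight of zero with this unsmoothed IDF.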
