N-Gram Based Sentiment Mining for Bangla Text Using Support Vector Machine

Opinion mining is a valuable source of knowledge for understanding collective opinion and making better decisions. It is a Natural Language Processing (NLP) task that determines whether a text expresses positive or negative sentiment. Web content is growing rapidly and provides a huge amount of information, and analyzing and organizing this information for better knowledge extraction is an important research issue. In this paper, we focus on opinion mining for Bangla text using diverse web-based data. We apply both linear and nonlinear Support Vector Machines (SVM) as the machine learning technique, together with the N-gram method, to classify Bangla documents collected from social media sites. Most work in this area treats a single word as a vector. Instead, we use N-grams so that each vector contains more than one word. N-grams are used extensively in text mining and natural language processing tasks. We obtained better results using N-grams for different values of n.
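To make the N-gram representation concrete, the following is a minimal sketch of word-level N-gram extraction and count vectors. It is an illustration only, not the implementation used in the paper: the function names (`word_ngrams`, `build_vocabulary`, `vectorize`), the whitespace tokenization, the raw-count weighting, and the two sample sentences are all assumptions introduced here.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word-level n-grams of `text` as a list of tuples.

    Assumption: tokens are separated by whitespace; a real system would
    likely need Bangla-specific tokenization and punctuation handling.
    """
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents, n):
    """Map every n-gram seen in `documents` to a column index."""
    vocab = {}
    for doc in documents:
        for gram in word_ngrams(doc, n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def vectorize(text, vocab, n):
    """Turn `text` into a count vector over the n-gram vocabulary.

    N-grams that are not in the vocabulary are ignored.
    """
    counts = Counter(word_ngrams(text, n))
    vector = [0] * len(vocab)
    for gram, count in counts.items():
        if gram in vocab:
            vector[vocab[gram]] = count
    return vector


# Hypothetical two-document corpus (roughly "the film is very good" and
# "the film is very bad"); these sentences are not taken from the paper's data.
docs = ["ছবিটি খুব ভালো", "ছবিটি খুব খারাপ"]
bigram_vocab = build_vocabulary(docs, n=2)
bigram_vectors = [vectorize(d, bigram_vocab, n=2) for d in docs]
```

Vectors of this kind could then be passed to an SVM classifier. For instance, scikit-learn's `SVC` accepts `kernel="linear"` or `kernel="rbf"`, which would correspond to the linear and nonlinear cases, although the abstract does not say which tools the authors used.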