A Hybrid Model for Social Media Sentiment Analysis for Indonesian Text

Sentiment analysis of Indonesian social media text matters because social media content is highly diverse, so it needs an accurate method whose output reflects the actual state of the data. The main problem is that the text is unstructured and written in non-standard language, which often leads to classification errors. This paper presents a hybrid model that combines lexicon-based and maximum entropy methods to classify the sentiment of Indonesian public opinion about the government. The method consists of dataset extraction, preprocessing, lexicon-based classification, machine learning training, machine learning classification, and result interpretation. The study classified 91 documents as neutral, 51 as negative, 39 as positive, and 152 as mixed sentiment. In the evaluation, the hybrid model for Indonesian sentiment analysis on social media achieved an accuracy of 84.31%, which compares favorably with previous studies. The outcome of this study is a sentiment analysis system that applies the hybrid method to Indonesian text on social media.
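The pipeline stages listed above (lexicon-based classification, machine learning training, machine learning classification) can be illustrated with a minimal sketch. The abstract does not say how the two classifiers are combined, so the sketch assumes one plausible arrangement: the lexicon labels every document it can decide, those labels train a maximum entropy (softmax) classifier, and the classifier handles the documents the lexicon leaves undecided. The word lists, function names, example sentences, and the rule that a document with both positive and negative hits counts as mixed are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy lexicon; the paper's actual lexicon is not given in the abstract.
POSITIVE = {"bagus", "hebat", "mantap"}
NEGATIVE = {"buruk", "jelek", "kecewa"}


def tokenize(text):
    """Very rough preprocessing stand-in: lowercase and split on whitespace."""
    return text.lower().split()


def lexicon_label(tokens):
    """Label from lexicon hits, or None when no lexicon word occurs (assumed rule)."""
    pos = any(t in POSITIVE for t in tokens)
    neg = any(t in NEGATIVE for t in tokens)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return None


def predict_proba(weights, tokens, classes):
    """Softmax over summed per-class token weights (bag-of-words maxent)."""
    scores = {c: sum(weights[c].get(t, 0.0) for t in tokens) for c in classes}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_maxent(docs, labels, classes, epochs=200, lr=0.5):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    weights = {c: {} for c in classes}
    for _ in range(epochs):
        for tokens, gold in zip(docs, labels):
            probs = predict_proba(weights, tokens, classes)
            for c in classes:
                grad = (1.0 if c == gold else 0.0) - probs[c]
                for t in tokens:
                    weights[c][t] = weights[c].get(t, 0.0) + lr * grad
    return weights


def hybrid_classify(texts):
    """Lexicon first; maxent (trained on lexicon-labeled docs) for the rest."""
    docs = [tokenize(t) for t in texts]
    seeds = [lexicon_label(d) for d in docs]
    train = [(d, s) for d, s in zip(docs, seeds) if s is not None]
    classes = sorted({s for _, s in train})
    weights = train_maxent([d for d, _ in train], [s for _, s in train], classes)
    results = []
    for tokens, seed in zip(docs, seeds):
        if seed is not None:
            results.append(seed)
        else:
            probs = predict_proba(weights, tokens, classes)
            results.append(max(classes, key=lambda c: probs[c]))
    return results


texts = [
    "pelayanan bagus dan cepat",
    "pelayanan buruk dan lambat",
    "aplikasi bagus tapi server buruk",
    "antrean lambat sekali",  # no lexicon hit: decided by the maxent model
]
labels = hybrid_classify(texts)
```

In this toy run the last sentence contains no lexicon word, so its label comes from the trained model, which has associated "lambat" with the negative class through the second sentence. A real system would need proper normalization of non-standard Indonesian spelling, a much larger lexicon, and a separate neutral class, none of which this sketch attempts.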
