Paper Information - Sentiment Analysis System of Indonesian Tweets using Lexicon and Naïve Bayes Approach

Social media has become the most popular source of user-generated content on the internet. In 2019, Indonesia had 150 million active social media users, or 56% of its total population [1]. Social media therefore produces big data with the potential to yield meaningful information. Twitter is one of the popular social media platforms in Indonesia, used by 50% of internet users [1]. We therefore conducted a sentiment analysis of Indonesian tweets using two different approaches: lexicon-based and machine learning. To this end, we developed a system that identifies and categorizes Indonesian tweets by polarity (positive, neutral, or negative). The system consists of six main processes: environment preparation, text preprocessing, reading the bag-of-words dataset, data processing, results, and conclusion. In the testing phase, we used several keywords as inputs to the sentiment analysis system. The results show that naïve Bayes achieved an accuracy of 84%, while the lexicon-based approach achieved 72%. We conclude that the machine learning approach gives better accuracy than the lexicon-based approach in our system.

Aan Kardiana | Mubarik Ahmad | Mochamad Ferdy Octaviansyah | Kukuh Fadli Prasetyo
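
The two approaches compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the six training tweets, and the names `LEXICON`, `TRAIN`, `tokenize`, `lexicon_polarity`, and `NaiveBayes` are all hypothetical. The abstract does not state which naïve Bayes variant was used, so the sketch assumes a multinomial model over bag-of-words counts with add-one (Laplace) smoothing.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon (not from the paper): word -> polarity weight.
LEXICON = {"bagus": 1, "senang": 1, "suka": 1,
           "buruk": -1, "kecewa": -1, "benci": -1}

# Hypothetical labeled tweets, used only to make the sketch runnable.
TRAIN = [
    ("pelayanan bagus sekali saya senang", "positive"),
    ("saya suka produk ini", "positive"),
    ("pelayanan buruk saya kecewa", "negative"),
    ("saya benci macet", "negative"),
    ("hari ini rapat jam sembilan", "neutral"),
    ("jadwal kereta berangkat pagi", "neutral"),
]


def tokenize(text):
    """Lowercase, drop URLs and @mentions, and return alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


def lexicon_polarity(text):
    """Lexicon-based approach: sum word weights, then map the sign to a label."""
    score = sum(LEXICON.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words counts (assumed variant)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            # Add-one smoothing: every vocabulary word gets one extra count.
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    logp += math.log((counts[tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


model = NaiveBayes().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
print(lexicon_polarity("Pelayanan bagus, saya senang @user"))
print(model.predict("produk bagus saya suka"))
```

The lexicon scorer needs no training data but can only react to words in its dictionary, whereas the naïve Bayes classifier learns word weights from labeled tweets. That difference is one plausible reason for an accuracy gap between the two approaches, although this sketch is far too small to reproduce the reported figures.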