Social media has grown rapidly, and people now routinely share their activities there, including complaints about companies' products and services. This gives a company the opportunity to learn how its customers feel about it. Classifying sentiment in social media messages, however, poses several challenges. First, the language used in social media is often informal and loosely structured: it uses abbreviations, substitutes numbers for letters, omits punctuation marks, and so on. Second, the messages are not restricted to a single domain, which makes them hard to classify. To address these challenges, this paper discusses a method for classifying the sentiment of social media messages written in Indonesian. The method assigns each sentence to one of several sentiment classes. Before classification, each sentence is transformed so that its informal words are replaced with formal ones. The transformation consists of punctuation removal, tokenization, number-to-letter conversion, reduction of repeated letters, the Levenshtein distance, and a corpus for formalizing abbreviations. The formalized sentences are the input to the classification model, both for training and for classification. We classify each message into one of four classes: Neutral (facts, greetings, etc.), Question, Positive Sentiment, and Negative Sentiment. Support Vector Machine (SVM) and Maximum Entropy are used as the classification algorithms, with the counts of positive, negative, and question words in a sentence as features. In our experiments, the best classifier is SVM, which yields 86.66% accuracy.
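The transformation and feature-extraction steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the digit-to-letter map, the toy vocabulary, the abbreviation corpus, the sentiment lexicons, the distance threshold, and the choice to look up abbreviations before falling back to the Levenshtein distance are all assumptions made for the example.

```python
import re

# Hypothetical resources; the paper's actual corpus and lexicons are not given here.
DIGIT_TO_LETTER = {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s"}
ABBREVIATIONS = {"knp": "kenapa", "yg": "yang", "tdk": "tidak"}
POSITIVE = {"bagus"}
NEGATIVE = {"lambat", "jelek"}
QUESTION = {"kenapa"}
FORMAL_WORDS = {"internet", "sekali", "yang", "tidak"} | POSITIVE | NEGATIVE | QUESTION


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def nearest_formal(token, max_distance=1):
    """Return the closest formal word, or the token itself if none is close enough."""
    best = min(sorted(FORMAL_WORDS), key=lambda w: levenshtein(token, w))
    return best if levenshtein(token, best) <= max_distance else token


def normalize(sentence):
    """Turn an informal sentence into a list of (hopefully) formal tokens."""
    text = re.sub(r"[^\w\s]", " ", sentence.lower())  # delete punctuation marks
    tokens = []
    for token in text.split():                        # whitespace tokenization
        if any(c.isalpha() for c in token):           # leave pure numbers alone
            token = "".join(DIGIT_TO_LETTER.get(c, c) for c in token)
        # Collapse every run of a repeated letter to one letter; this also
        # shortens legitimate double letters, which the fuzzy match may repair.
        token = re.sub(r"([a-z])\1+", r"\1", token)
        if token in ABBREVIATIONS:
            token = ABBREVIATIONS[token]
        elif token not in FORMAL_WORDS:
            token = nearest_formal(token)
        tokens.append(token)
    return tokens


def count_features(tokens):
    """Counts of positive, negative, and question words, in that order."""
    return (
        sum(t in POSITIVE for t in tokens),
        sum(t in NEGATIVE for t in tokens),
        sum(t in QUESTION for t in tokens),
    )


tokens = normalize("Knp internet l4mbaaat sekaliii???")
print(tokens)                  # ['kenapa', 'internet', 'lambat', 'sekali']
print(count_features(tokens))  # (0, 1, 1)
```

Feature tuples of this kind would then be passed to a classifier such as an SVM or a Maximum Entropy model; that step is omitted here because it depends on the labeled training data.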