Event Classification in Surabaya on Twitter with Support Vector Machine

Twitter is a social media platform used by many people around the world, and a great deal of information is spread and obtained through it. For example, a company organizing a new event may need many people to know about it. This motivates a system that presents information to users by detecting particular events in Twitter data. In this study, tweets are retrieved using the Twitter API and stored in JSON format. Pre-processing then removes special characters, numbers, and URLs, and applies stemming and lowercasing. Features are extracted using Global Vectors for Word Representation (GloVe). Tweets are classified into four classes: Competitions, Seminars, Festivals, and Other events. A support vector machine (SVM) is used to predict the event type. Three SVM variants are tested: SVM C, SVM linear, and SVM Nu. For SVM Nu, the kernel and nu parameters were varied to find the best accuracy. In our experiments, the best result was an accuracy of 85.2%, obtained with the NuSVC method using an RBF kernel and a nu parameter of 0.2.
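The pre-processing and feature-extraction steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `preprocess` and `tweet_vector`, the regular expressions, and the toy two-dimensional embedding table are all hypothetical. Averaging word vectors to represent a tweet is an assumption, since the abstract does not say how GloVe vectors are combined. Stemming is left out because it needs a language-specific stemmer.

```python
import re

# Hypothetical patterns: URLs first, then anything that is not a letter or space.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(tweet):
    """Lowercase a tweet, strip URLs, digits, and punctuation, and tokenize."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)         # remove URLs before stripping characters
    text = NON_LETTER_RE.sub(" ", text)  # remove numbers and special characters
    return text.split()


def tweet_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens (an assumed pooling strategy).

    Returns a zero vector when no token is in the embedding table.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


# Toy stand-in for pre-trained GloVe vectors (illustrative values only).
TOY_EMBEDDINGS = {
    "seminar": [1.0, 0.0],
    "festival": [0.0, 1.0],
}

tokens = preprocess("Seminar & Festival 2019 in Surabaya! https://example.com/info")
vector = tweet_vector(tokens, TOY_EMBEDDINGS, dim=2)
```

Vectors built this way could then be passed to an SVM classifier. One possible choice is scikit-learn's `NuSVC(kernel="rbf", nu=0.2)`, which matches the best configuration reported above, although the abstract does not name the library used.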
