Paper information - SVM based approach for opinion classification in Arabic written tweets

SVM based approach for opinion classification in Arabic written tweets

We propose a machine learning approach for automatically classifying opinions in Twitter texts written in Modern Standard Arabic (MSA). Each tweet is assigned to one of four classes: positive, negative, neutral, or non-opinion. We use a range of features for opinion classification, mainly linguistic and numeric. Our training data, collected and developed in-house, consists of tweets that retain their platform-specific elements, such as @user mentions and #hashtags, which serve as tweet-specific features. We applied four machine learning algorithms to our dataset: Support Vector Machine (SVM), Naive Bayes (NB), J48 decision tree, and Random Forest. The experimental results show that SVM gives the highest F-measure (72%), while the J48 classifier gives the highest precision (70.97%). The results also demonstrate that tweet-specific features can significantly improve classification performance compared with other feature combinations.
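As an illustration of what tweet-specific features might look like, the following is a minimal sketch that counts @user mentions and #hashtags in a tweet. The regular expressions, the function name `tweet_features`, and the feature names are assumptions made for this example; the abstract does not give the exact feature definitions used in the paper.

```python
import re

# Hypothetical patterns: the paper's exact feature definitions are not stated
# in the abstract. In Python 3, \w matches Unicode letters, including Arabic.
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def tweet_features(text: str) -> dict:
    """Return simple numeric tweet-specific features for one tweet."""
    mentions = MENTION_RE.findall(text)
    hashtags = HASHTAG_RE.findall(text)
    return {
        "n_mentions": len(mentions),           # number of @user mentions
        "n_hashtags": len(hashtags),           # number of #hashtags
        "has_mention": int(bool(mentions)),    # binary presence flag
        "has_hashtag": int(bool(hashtags)),    # binary presence flag
        "n_tokens": len(text.split()),         # whitespace-separated tokens
    }


if __name__ == "__main__":
    # Made-up example tweet, not taken from the paper's dataset.
    example = "@user1 الخدمة ممتازة #تجربة #رأي"
    print(tweet_features(example))
```

A feature dictionary of this kind could be combined with linguistic features and passed to any of the classifiers named in the abstract. How the paper actually encodes and combines its features is not specified here.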
