Sentiment analysis of extremism in social media from textual information

Abstract Uncertainty about political, religious, and social issues causes extremism among people, which they express through the sentiments they post on social media. Although English is the most common language for sharing views on social media, local users also write in regional languages. Views expressed in those languages therefore need to be analysed alongside the widely used languages to draw better insights from the data. This research focuses on sentiment analysis of multilingual textual data from social media to measure the intensity of extremist sentiment. Our study classifies each textual view into one of four categories (high extreme, low extreme, moderate, or neutral) according to its level of extremism. First, a multilingual lexicon with intensity weights is created. Domain experts validated this lexicon, and it attains 88% accuracy in validation. Subsequently, the Multinomial Naive Bayes and Linear Support Vector Classifier algorithms are employed for classification. On the underlying multilingual dataset, the Linear Support Vector Classifier performs best, with an accuracy of 82%.
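As a rough illustration of how a lexicon with intensity weights could map a post to one of the four categories, consider the minimal sketch below. It is not the study's implementation: the lexicon entries, the weights, the thresholds, the assumed ordering of the categories, and the names `LEXICON`, `intensity_score`, and `categorize` are all hypothetical and chosen only for demonstration.

```python
import re

# Hypothetical lexicon: term -> intensity weight. Terms from any language
# could share this one table. These entries are invented for illustration.
LEXICON = {"destroy": 3.0, "hate": 2.0, "angry": 1.0}


def intensity_score(text, lexicon):
    """Sum the intensity weights of all lexicon terms found in the text."""
    tokens = re.findall(r"\w+", text.lower())  # Unicode-aware word tokens
    return sum(lexicon.get(token, 0.0) for token in tokens)


def categorize(score):
    """Map a score to a category; the cut-offs here are assumed values."""
    if score >= 4.0:
        return "high extreme"
    if score >= 2.0:
        return "low extreme"
    if score > 0.0:
        return "moderate"
    return "neutral"


if __name__ == "__main__":
    for post in ["They hate and destroy", "I hate this", "Nice weather"]:
        print(post, "->", categorize(intensity_score(post, LEXICON)))
```

Labels produced this way could serve as training targets for a supervised classifier such as Multinomial Naive Bayes or a Linear Support Vector Classifier, though how the study combines the lexicon with the classifiers is not stated in the abstract.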
