Detecting Abusive Comments in Discussion Threads Using Naïve Bayes

Comments are supported by many websites and provide a simple way to increase user engagement. Users can generally comment on different types of media, such as social networks, blogs, forums, and news articles. As discussions increasingly move to online forums, insulting and abusive comments are becoming prevalent. Moreover, the sheer volume of comments produced on these platforms makes it infeasible for a human moderator to inspect each comment individually and flag it as abusive or not abusive. An automated classifier that is fast and efficient is therefore needed to detect such comments. To this end, this paper presents a Naïve Bayes classifier for detecting abusive comments written in Bangla. Using a training corpus collected from “Youtube.com”, the Naïve Bayes classifier is employed to categorize comments as abusive or not abusive. Finally, performance is evaluated using 10-fold cross-validation on unprocessed data.
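
The following is a minimal sketch of the kind of pipeline the abstract describes: a Multinomial Naïve Bayes classifier over bag-of-words features, evaluated with 10-fold cross-validation on raw comment text. It is not the authors' implementation; the file name "bangla_comments.csv" and the column names "comment" and "label" are hypothetical placeholders, and scikit-learn is assumed as the modeling library.

```python
# Sketch: Multinomial Naive Bayes for binary (abusive / not abusive) comment
# classification with 10-fold cross-validation.
# Assumes a labeled CSV "bangla_comments.csv" (hypothetical) with columns
# "comment" (raw Bangla text) and "label" (1 = abusive, 0 = not abusive).
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Load the labeled corpus (placeholder path and column names).
data = pd.read_csv("bangla_comments.csv")
comments, labels = data["comment"], data["label"]

# Bag-of-words token counts feeding a multinomial Naive Bayes classifier;
# the text is used as-is, mirroring evaluation on unprocessed data.
model = make_pipeline(
    CountVectorizer(),
    MultinomialNB(),
)

# 10-fold cross-validation on the unprocessed comments.
scores = cross_val_score(model, comments, labels, cv=10, scoring="accuracy")
print(f"Mean accuracy over 10 folds: {scores.mean():.3f}")
```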
