Multilingual Cyber Abuse Detection using Advanced Transformer Architecture

The growing number of active online users has been accompanied by a rise in reported incidents of cyber abuse. Such incidents threaten the privacy and liberty of users in the digital space. Conventionally, platforms have relied on manual moderation and user reporting to keep abusive text offline. This approach has several flaws, including dependence on human reviewers, longer delays and reduced data privacy. Previous attempts to automate the process have used supervised machine learning and traditional recurrent sequence models, which tend to perform poorly on non-English text. As the user base of cyberspace grows more diverse, a flexible solution that can accommodate multilingual text is needed. Furthermore, text in colloquial languages often holds pertinent context and emotion that is lost in translation. In this paper, we propose a generative deep-learning-based approach that uses the bidirectional, transformer-based BERT architecture for cyber abuse detection across English, Hindi and code-mixed Hindi-English (Hinglish) text. The proposed architecture achieves state-of-the-art results on the code-mixed Hindi dataset of the TRAC-1 standard aggression identification task and strong results on the English task leaderboard as well. These results are obtained with a single model, without ensemble-based methods, and the approach thus proves to be a better alternative to existing approaches. Deep-learning-based models that perform well on multilingual text can handle a broader range of inputs and can thus prove crucial in curbing such social evils.
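The leaderboard comparison mentioned above can be made concrete with a small scoring sketch. It assumes the three TRAC-1 classes (overtly aggressive OAG, covertly aggressive CAG, non-aggressive NAG) and the support-weighted F1 score used to rank systems in that shared task. The function name `weighted_f1` and the toy labels are hypothetical and illustrate only the metric, not the authors' evaluation code.

```python
from collections import Counter

# Assumed label set of the TRAC-1 aggression identification task.
LABELS = ("OAG", "CAG", "NAG")


def weighted_f1(gold, pred, labels=LABELS):
    """Average the per-class F1 scores, weighting each class by its gold support."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")

    support = Counter(gold)
    total = len(gold)
    score = 0.0
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 is defined as 0 when a class has no gold and no predicted items.
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


# Toy example with made-up labels: one covertly aggressive post is missed.
gold = ["OAG", "CAG", "NAG", "NAG"]
pred = ["OAG", "NAG", "NAG", "NAG"]
print(round(weighted_f1(gold, pred), 4))
```

Because each class is weighted by its gold support, a model that ignores a rare class such as CAG is penalized only in proportion to how often that class occurs.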
