Fuzzy Multi-task Learning for Hate Speech Type Identification

In traditional machine learning, classifiers are typically trained in a single-task setting, so that the trained classifier discriminates between different classes. This rests on the assumption that the classes are mutually exclusive, which does not always hold in real applications. For example, the same book may belong to multiple subjects. This observation motivated researchers to formulate multi-label learning problems, in which each instance can be assigned multiple labels, although classifier training is still typically undertaken in a single-task setting. When probabilistic approaches are adopted for classifier training, multi-task learning can be enabled by transforming a multi-labelled data set into several binary data sets, but this transformation often results in class imbalance. Without the transformation, multi-labelling of data leads to an exponential increase in the number of classes, leaving fewer instances for each class and making each class harder to identify. In addition, multi-labelling of data is very time-consuming and expensive in some application areas, such as hate speech detection. In this paper, we introduce a novel formulation of the hate speech type identification problem in the setting of multi-task learning through our proposed fuzzy ensemble approach. In this setting, single-labelled data can be used for semi-supervised multi-label learning, and two new metrics (detection rate and irrelevance rate) are proposed to measure performance on this kind of learning task more effectively. We report an experimental study on the identification of four types of hate speech, namely religion, race, disability and sexual orientation. The experimental results show that our proposed fuzzy ensemble approach outperforms other popular probabilistic approaches, with an overall detection rate of 0.93.
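The data transformation mentioned above, turning a multi-labelled data set into one binary data set per label, can be illustrated with a minimal sketch. The toy instances, the helper names `to_binary_datasets` and `label_powerset_size`, and the data layout are assumptions made for illustration only; they are not the data or the implementation used in the paper. The sketch shows how each binary data set tends to have few positives (class imbalance) and how the number of label combinations grows exponentially if no transformation is applied.

```python
from collections import Counter

# The four hate speech types named in the abstract.
LABELS = ["religion", "race", "disability", "sexual orientation"]


def to_binary_datasets(instances, labels):
    """Build one binary data set per label (1 = label present, 0 = absent)."""
    return {
        label: [(text, int(label in tags)) for text, tags in instances]
        for label in labels
    }


def label_powerset_size(labels):
    """Count all possible label combinations, including the empty one."""
    return 2 ** len(labels)


# Hypothetical toy data: (text, set of assigned labels).
toy = [
    ("text a", {"religion"}),
    ("text b", {"race"}),
    ("text c", {"religion", "race"}),
    ("text d", {"disability"}),
    ("text e", {"sexual orientation"}),
    ("text f", set()),
]

binary = to_binary_datasets(toy, LABELS)
for label, rows in binary.items():
    # Positives are the minority in every binary data set.
    counts = Counter(y for _, y in rows)
    print(label, "positives:", counts[1], "negatives:", counts[0])

# Without the transformation, four labels give 16 possible classes.
print("label combinations:", label_powerset_size(LABELS))
```

In this toy example every binary data set has at most two positives out of six instances, while treating each label combination as its own class would spread the same six instances over up to 16 classes.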
