Debunking Biases in Attention

Despite remarkable performance across a wide range of applications, machine learning (ML) models can discriminate: they may produce biased decisions that negatively affect individuals and society. Various methods have recently been developed to mitigate bias while preserving strong predictive performance. Attention mechanisms are a fundamental component of many state-of-the-art ML models and may influence model fairness, yet how they do so has not been thoroughly explored. In this paper, we investigate how different attention mechanisms affect the fairness of ML models, focusing on models used in Natural Language Processing (NLP). We evaluate the fairness and accuracy of several models, with and without different attention mechanisms, on widely used benchmark datasets. Our results indicate that most of the attention mechanisms assessed improve the fairness of Bidirectional Gated Recurrent Unit (BiGRU) and Bidirectional Long Short-Term Memory (BiLSTM) models on all three datasets with respect to religion- and gender-sensitive groups, albeit with varying trade-offs in accuracy. Our findings highlight that adopting specific attention mechanisms in machine learning models can affect fairness on certain datasets.
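The abstract does not name the specific fairness measures used, but a common group-fairness metric in this line of work is the equality-of-opportunity gap: the difference in true-positive rates between sensitive groups (e.g. religion or gender). A minimal sketch, assuming binary labels, binary predictions, and a binary sensitive attribute (function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rate (recall on the
    positive class) between the two sensitive groups 0 and 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in (0, 1):
        mask = (group == g) & (y_true == 1)  # positive examples in group g
        tprs.append(y_pred[mask].mean())     # fraction correctly predicted positive
    return abs(tprs[0] - tprs[1])

# Toy example: group 0 positives are caught 2/2, group 1 positives 1/2.
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0]
group  = [0, 0, 0, 1, 1, 1]
print(equal_opportunity_gap(y_true, y_pred, group))  # 0.5
```

A gap of 0 means both groups receive correct positive predictions at the same rate; comparing this gap for a model with and without an attention layer is one way to quantify the fairness effect the abstract describes.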
