Fairness-aware Outlier Ensemble

Outlier ensemble methods have shown outstanding performance in discovering instances that differ significantly from the majority of the data. However, without fairness awareness, their applicability in ethically sensitive scenarios, such as fraud detection and judicial decision systems, could be degraded. In this paper, we propose to reduce the bias of outlier ensemble results through a fairness-aware ensemble framework. Because the outlier detection task lacks ground truth, the key challenge is how to limit the degradation in detection performance as fairness improves. To address this challenge, we define a distance measure based on the output of conventional outlier ensemble techniques to estimate the possible cost associated with detection performance degradation. Meanwhile, we propose a post-processing framework that tunes the original ensemble results through a stacking process to achieve a trade-off between fairness and detection performance. Detection performance is measured by the area under the ROC curve (AUC), while fairness is measured at both the group and individual levels. Experiments are conducted on eight public datasets. The results demonstrate the effectiveness of the proposed framework in improving the fairness of outlier ensemble results. We also analyze the trade-off between AUC and fairness.
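
To make the two evaluation axes concrete, the following is a minimal sketch and not the proposed framework itself. It averages the scores of two hypothetical detectors, computes AUC by pairwise comparison, and reports a group-level gap in flag rates. The abstract does not specify which group fairness metric is used, so the statistical-parity-style gap below is an assumption, as are all function names, the toy scores, and the top-k flagging rule.

```python
def average_ensemble(score_lists):
    """Combine per-detector outlier scores by simple averaging (assumed combiner)."""
    return [sum(col) / len(col) for col in zip(*score_lists)]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def flag_rate_gap(scores, groups, k):
    """Absolute difference between groups 0 and 1 in the fraction of members
    among the k highest-scored instances (an assumed group fairness measure)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:k])
    rates = []
    for g in (0, 1):
        members = [i for i, gi in enumerate(groups) if gi == g]
        rates.append(sum(i in flagged for i in members) / len(members))
    return abs(rates[0] - rates[1])


# Toy data: two hypothetical detectors, six instances.
detector_a = [0.9, 0.2, 0.8, 0.1, 0.3, 0.7]
detector_b = [0.7, 0.4, 0.6, 0.1, 0.5, 0.3]
labels = [1, 0, 0, 0, 0, 1]   # 1 = true outlier (used for evaluation only)
groups = [0, 0, 0, 1, 1, 1]   # binary sensitive attribute

ensemble = average_ensemble([detector_a, detector_b])
detection_auc = auc(ensemble, labels)
parity_gap = flag_rate_gap(ensemble, groups, k=2)
print(f"AUC = {detection_auc:.3f}, flag-rate gap = {parity_gap:.3f}")
```

A post-processing step that trades AUC for fairness would adjust `ensemble` so that `parity_gap` shrinks while `detection_auc` drops as little as possible; how that adjustment is done is the subject of the paper and is not reproduced here.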
