Identifying Biased Subgroups in Ranking and Classification

When analyzing the behavior of machine learning algorithms, it is important to identify specific data subgroups on which the algorithm performs differently than on the dataset as a whole. Identifying the attributes that define these subgroups normally requires the intervention of domain experts. We introduce the notion of divergence to measure this performance difference, and we exploit it in the context of (i) classification models and (ii) ranking applications to automatically detect data subgroups whose behavior deviates significantly. Furthermore, we quantify the contribution of each attribute in a subgroup to the divergent behavior by means of Shapley values, thus identifying the most impactful attributes.
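To make the two ideas concrete, the following is a minimal sketch, not the paper's implementation: it assumes a pandas DataFrame `df` with categorical attribute columns and a per-instance boolean outcome column (here hypothetically named `misclassified`; any per-instance statistic, such as a false-positive indicator, would work). The function names `divergence` and `shapley_contributions`, and the representation of a subgroup as a `pattern` dictionary of attribute-value pairs, are illustrative assumptions.

```python
from itertools import combinations
from math import factorial

import pandas as pd


def divergence(df, pattern, outcome="misclassified"):
    """Divergence of a subgroup: the difference between the subgroup's
    average outcome (e.g. error rate) and the global average.
    `pattern` is a dict {attribute: value} defining the subgroup."""
    mask = pd.Series(True, index=df.index)
    for attr, val in pattern.items():
        mask &= df[attr] == val
    subgroup = df[mask]
    if subgroup.empty:
        return 0.0
    return subgroup[outcome].mean() - df[outcome].mean()


def shapley_contributions(df, pattern, outcome="misclassified"):
    """Exact Shapley value of each attribute=value item in `pattern`,
    using divergence as the characteristic function: each item's value
    is its weighted average marginal contribution over all subsets of
    the remaining items."""
    items = list(pattern.items())
    n = len(items)
    contributions = {}
    for i, item in enumerate(items):
        others = items[:i] + items[i + 1:]
        value = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_item = divergence(df, dict(subset + (item,)), outcome)
                without_item = divergence(df, dict(subset), outcome)
                value += weight * (with_item - without_item)
        contributions[item] = value
    return contributions
```

For example, `shapley_contributions(df, {"sex": "Female", "age": "<25"})` would split that subgroup's divergence between its two defining items. Exact computation enumerates all subsets and is thus exponential in the pattern length, which is acceptable in this setting because divergent patterns are typically defined by few attributes.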
