Discrimination detection by causal effect estimation

As more and more decisions are made by algorithms learnt from data, algorithmic discrimination has become a risk to civil rights. Detecting discrimination is a process of counterfactual reasoning. This paper proposes a general detection framework that combines a data mining method with a well-established counterfactual reasoning framework, the potential outcome model. The potential outcome model supports operational definitions of global discrimination, local discrimination, and discrimination by combined factors, while the data mining method makes the detection efficient. The proposed method, instantiated by association rule mining with causal effect estimation based on the potential outcome model, is evaluated on four real-world data sets and compared with a Bayesian network (BN) based detection method. It detects not only the global discrimination found by the BN based method, but also local and combined discrimination that the BN based method cannot find. The proposed method is efficient and scales well with the data set size and the number of attributes.
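
To make the counterfactual reading concrete, the following is a minimal sketch of one simple potential-outcome style estimator: the difference in favourable-decision rates between the protected and unprotected groups, computed within strata of individuals who share the same covariate values and then averaged by stratum size. It is an illustration under stated assumptions, not the paper's algorithm: the function name `stratified_effect`, the helper `make`, the attribute names, and the toy admissions records are all hypothetical, and exact stratification stands in for whatever matching and rule mining the paper actually uses.

```python
from collections import defaultdict


def stratified_effect(records, protected, outcome, covariates):
    """Size-weighted average, over covariate strata, of the outcome-rate
    difference between protected == 1 and protected == 0.

    Strata that contain only one of the two groups are skipped, because
    they offer no comparable individuals. Returns None if no stratum is usable.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for r in records:
        key = tuple(r[c] for c in covariates)
        strata[key][r[protected]].append(r[outcome])

    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        if not groups[0] or not groups[1]:
            continue
        n = len(groups[0]) + len(groups[1])
        diff = sum(groups[1]) / len(groups[1]) - sum(groups[0]) / len(groups[0])
        weighted_sum += n * diff
        total += n
    return weighted_sum / total if total else None


def make(dept, female, admitted, n):
    """Hypothetical helper: n identical toy records."""
    return [{"dept": dept, "female": female, "admitted": admitted}] * n


# Toy data (invented for illustration): equal admission rates within each
# department, but women apply mostly to the more selective department B.
records = (
    make("A", 1, 1, 8) + make("A", 1, 0, 2)
    + make("A", 0, 1, 16) + make("A", 0, 0, 4)
    + make("B", 1, 1, 4) + make("B", 1, 0, 16)
    + make("B", 0, 1, 2) + make("B", 0, 0, 8)
)

# With no covariates there is a single stratum, so this is the raw rate gap.
raw = stratified_effect(records, "female", "admitted", [])
# Comparing like with like (same department) removes the apparent gap.
adjusted = stratified_effect(records, "female", "admitted", ["dept"])

print(round(raw, 3))  # -0.2
print(adjusted)       # 0.0
```

In this toy case the raw gap of about 0.2 disappears once applicants are compared within the same department. Restricting `records` to a subgroup before calling the function would give a local estimate in the same spirit.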
