Explainable and Non-explainable Discrimination in Classification

Nowadays, more and more decisions in lending, recruitment, and grant or study applications are partially automated using computational models (classifiers) built on historical data. If the historical data was discriminatory towards socially and legally protected groups, a model learnt from this data will make discriminatory decisions in the future. As a solution, most discrimination-free modeling techniques force the treatment of the sensitive groups to be equal, without taking into account that some differences may be explained by other factors and thus be justified. For example, disproportional recruitment rates for males and females may be explainable by the fact that more males have higher education; treating males and females equally would then introduce reverse discrimination, which may be undesirable as well. Given that the law or domain experts specify which factors are discriminatory (e.g. gender, marital status) and which can be used for explanation (e.g. education), this chapter presents a methodology for quantifying the tolerable difference in the treatment of the sensitive groups. We show how to measure which part of the difference is explainable, and present local learning techniques that remove exactly the illegal (non-explainable) discrimination, allowing differences in decisions to remain as long as they are explainable.
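
To make the decomposition concrete, the following Python sketch illustrates one way the difference in acceptance rates between two groups could be split into an explainable part (attributable to an explanatory attribute such as education) and a non-explainable remainder. It is a minimal illustration, not the chapter's exact procedure: the choice to average the two groups' acceptance rates within each education level as the "fair" local rate is an assumption made here for the example, and all data values and names are hypothetical.

```python
# Toy decomposition of a group difference in acceptance rates into an
# explainable part and a non-explainable remainder. Illustrative only.

# Toy records: (gender, education, accepted) -- all values are made up.
records = [
    ("m", "higher", 1), ("m", "higher", 1), ("m", "higher", 0),
    ("m", "basic",  1), ("m", "basic",  0),
    ("f", "higher", 1), ("f", "higher", 0),
    ("f", "basic",  0), ("f", "basic",  0), ("f", "basic",  1),
]

def acceptance_rate(rows):
    return sum(r[2] for r in rows) / len(rows)

males   = [r for r in records if r[0] == "m"]
females = [r for r in records if r[0] == "f"]

# Overall difference in acceptance rates between the groups.
d_all = acceptance_rate(males) - acceptance_rate(females)

levels = sorted({r[1] for r in records})

# Assumed "fair" acceptance rate per education level: the unweighted
# mean of the male and female rates at that level (an illustrative choice).
star = {}
for e in levels:
    m_e = [r for r in males if r[1] == e]
    f_e = [r for r in females if r[1] == e]
    star[e] = (acceptance_rate(m_e) + acceptance_rate(f_e)) / 2

def explainable_rate(rows):
    # Acceptance rate a group would have if, within every education
    # level, it were accepted at the common rate star[e]; any remaining
    # gap then stems only from the groups' education distributions.
    return sum(
        (len([r for r in rows if r[1] == e]) / len(rows)) * star[e]
        for e in levels
    )

d_expl = explainable_rate(males) - explainable_rate(females)
d_illegal = d_all - d_expl  # the part a model should not reproduce

print(f"total difference:      {d_all:+.3f}")
print(f"explainable part:      {d_expl:+.3f}")
print(f"non-explainable part:  {d_illegal:+.3f}")
```

On this toy data the total gap of +0.200 splits into roughly +0.033 explainable (males more often have higher education, which carries a higher acceptance rate) and +0.167 non-explainable; it is the latter quantity that the discrimination-removal techniques discussed in this chapter aim to eliminate.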