FairML: ToolBox for diagnosing bias in predictive modeling

Predictive models are increasingly deployed to determine access to services such as credit, insurance, and employment. Although deploying these models has brought societal gains in efficiency and productivity, their potential systemic flaws have not been fully addressed, particularly the potential for unintentional discrimination. This discrimination could be on the basis of race, gender, religion, sexual orientation, or other characteristics. This thesis addresses the question: how can an analyst determine the relative significance of the inputs to a black-box predictive model in order to assess the model's fairness (or discriminatory extent)? We present FairML, an end-to-end toolbox for auditing predictive models by quantifying the relative significance of the model's inputs. FairML leverages model compression and four input ranking algorithms to quantify a model's relative predictive dependence on its inputs. The relative significance of the inputs can then be used to assess the fairness (or discriminatory extent) of the model. With FairML, analysts can more easily audit cumbersome predictive models that are difficult to interpret.

Thesis Supervisor: Dr. Lalana Kagal
Title: Principal Research Scientist, CSAIL

Thesis Supervisor: Professor Harold Abelson
Title: Class of 1922 Professor of Computer Science and Engineering

Thesis Supervisor: Professor Alex "Sandy" Pentland
Title: Toshiba Professor of Media Arts and Sciences